
Releases: columbia-applied-data-science/rosetta

Streamers split and LDA results speedups.

18 Aug 23:25
  • LDAResults improvements, including speedups and better memory management
  • Split the streamers module into separate modules for easier dependency handling and imports

Parallelized VW methods for general Text Streaming

03 Apr 16:23

  • new parallel_easy utils for memory-friendly iterator functionality
  • new threading_easy utils for easy multithreading
  • VW methods are parallelized for generic text streamers
  • protected import statements for non-standard libraries
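
The "protected import" item refers to a common Python pattern: an optional third-party library is imported inside a guard, so the package still loads when that library is missing. The sketch below is a generic illustration of that pattern, not rosetta's actual code. The helper names `optional_import` and `require`, and the choice of `scipy.sparse` as the optional dependency, are assumptions made for the example.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here instead of breaking the importing module.
        return None


def require(module, name):
    """Raise a clear error only when the optional feature is actually used."""
    if module is None:
        raise ImportError(
            f"{name} is required for this feature but is not installed"
        )
    return module


# Hypothetical optional dependency: stays None when scipy is absent.
scipy_sparse = optional_import("scipy.sparse")
```

With this arrangement the import error is deferred: code paths that never touch the optional library keep working, and `require(scipy_sparse, "scipy")` fails with a readable message at the point of use.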

.to_scipysparse method added for streamers and bug fixes

20 Feb 20:15

New Streamers, enhanced nlp

18 Feb 17:33
  • new TextStreamer class to handle general text stream processes
  • explicit doc path passing option in TextFileStreamer
  • updated version of nlp.word_tokenize
  • minor bug fixes

Improved EDA

09 Dec 01:58

Major improvements to the modeling.eda module.

Improved LDA prediction

07 Dec 18:27
  • improved LDA predict function
  • improved documentation
  • removed old dependencies