Delphi Inspiration

Components and Applications


YuStemmer: Version History

YuStemmer is a natural language stemming library for 15 languages. It reduces an inflected word to a common root form. YuStemmer is algorithmic, which makes it small and fast.

YuStemmer 5.0.0 – 21 Mar 2019

  • New stemmers:
    1. Greek: TYuStemmer_Greek_8, TYuStemmer_Greek_16.
    2. Indonesian: TYuStemmer_Indonesian, TYuStemmer_Indonesian_8, TYuStemmer_Indonesian_16.
    3. Lithuanian: TYuStemmer_Lithuanian_8, TYuStemmer_Lithuanian_16.
    4. Nepali: TYuStemmer_Nepali_8, TYuStemmer_Nepali_16.
  • Danish and Finnish stemmers no longer mangle numbers. They now define “consonant” more tightly than just “not a vowel”, which keeps numbers from being truncated and also tends to leave foreign words alone. Search indexes need updating.
  • French stemmer recognizes suffixes that begin with diaereses. Search indexes need updating.
  • Latin stemmer uses a single tab (instead of multiple whitespace characters) to separate the noun and verb forms of the result. Code might need updating.
  • Russian stemmer normalizes 'ё' and maps it to 'е'. Search indexes need updating.
  • Turkish stemmer runs up to 11% faster by checking for 'ad' or 'soyad' more efficiently. Search indexes need updating.
  • Fix handling of 3-byte UTF-8 sequences and add handling of 4-byte UTF-8 sequences.
  • Simplify the code of several stemmers and improve them.
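
Code that parses the Latin stemmer result may need to split on the tab character after this release. The following is a minimal sketch of that split, written in Python rather than Delphi and not using any YuStemmer API. The function name `split_latin_result` and the placeholder input are assumptions made for illustration only.

```python
from typing import Tuple


def split_latin_result(result: str) -> Tuple[str, str]:
    """Split a tab-separated result into (noun_form, verb_form).

    Hypothetical helper: it assumes the result holds exactly the two
    forms described in the release note, separated by one tab.
    """
    noun_form, sep, verb_form = result.partition("\t")
    if not sep:
        # No tab found: the input does not follow the 5.0.0 format.
        raise ValueError("expected a tab between the noun and verb forms")
    return noun_form, verb_form


# Placeholder input, not real stemmer output.
noun, verb = split_latin_result("NOUNFORM\tVERBFORM")
```

Raising an error when no tab is present makes it obvious when a result in the older, whitespace-separated format reaches code that expects the new one.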

YuStemmer 4.1.0 – 24 Dec 2018

  • Support Delphi 10.3 Rio Win32 and Win64.

YuStemmer 4.0.0 – 3 Apr 2017

  • Support Delphi 10.2 Tokyo Win32 and Win64.
  • New stemmers:
    • Arabic: TYuStemmer_Arabic_8, TYuStemmer_Arabic_16.
    • Kraaij Pohlmann (Dutch): TYuStemmer_Kraaij_Pohlmann, TYuStemmer_Kraaij_Pohlmann_8, TYuStemmer_Kraaij_Pohlmann_16.
    • Latin: TYuStemmer_Latin, TYuStemmer_Latin_8, TYuStemmer_Latin_16.
    • Lovins (English): TYuStemmer_Lovins, TYuStemmer_Lovins_8, TYuStemmer_Lovins_16.
    • Slovene: TYuStemmer_Slovene_8, TYuStemmer_Slovene_16.
    • Tamil: TYuStemmer_Tamil_8, TYuStemmer_Tamil_16.
  • Fix TYuStemmer_Czech_8 and TYuStemmer_Czech_16 to handle Unicode properly.
  • Portuguese stemmer fix: Replace Spanish suffixes with Portuguese ones.
  • Greatly expand test cases.

YuStemmer 3.7.0 – 7 May 2016

  • Support Delphi 10.1 Berlin Win32 and Win64.

YuStemmer 3.6.2 – 15 Sep 2015

  • Support Delphi 10 Seattle Win32 and Win64.

YuStemmer 3.6.1 – 25 Apr 2015

  • Support Delphi XE8 Win32 and Win64.

YuStemmer 3.6.0 – 3 Oct 2014

  • Support Delphi XE7 Win32 and Win64.
  • New stemmers:
    • Armenian: TYuStemmer_Armenian_8, TYuStemmer_Armenian_16.
    • Basque: TYuStemmer_Basque, TYuStemmer_Basque_8, TYuStemmer_Basque_16.
    • Catalan: TYuStemmer_Catalan, TYuStemmer_Catalan_8, TYuStemmer_Catalan_16.
    • Czech: TYuStemmer_Czech, TYuStemmer_Czech_8, TYuStemmer_Czech_16.
    • Irish: TYuStemmer_Irish, TYuStemmer_Irish_8, TYuStemmer_Irish_16.
  • Hungarian stemmer TYuStemmer_Hungarian now expects ISO 8859-2 instead of ISO 8859-1 encoded strings.
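
The change of expected encoding matters because the Hungarian letters ő and ű exist in ISO 8859-2 but not in ISO 8859-1. The sketch below, written in Python rather than Delphi and not using any YuStemmer API, shows how the same byte reads differently under the two encodings. The helper name `to_iso_8859_2` is hypothetical.

```python
def to_iso_8859_2(text: str) -> bytes:
    """Encode text as ISO 8859-2 (Latin-2), the encoding named in the release note."""
    return text.encode("iso8859_2")


encoded = to_iso_8859_2("ő")              # a single byte, 0xF5
as_latin2 = encoded.decode("iso8859_2")   # round-trips to "ő"
as_latin1 = encoded.decode("latin-1")     # the same byte reads as "õ" in ISO 8859-1
```

Strings prepared for the older ISO 8859-1 behaviour would therefore need to be re-encoded before being passed to the stemmer.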

YuStemmer 3.5.0 – 28 Apr 2014

  • Support Delphi XE6 Win32 and Win64.

YuStemmer 3.0.0 – 25 Sep 2013

  • Support Delphi XE5 Win32 and Win64.

YuStemmer 2.6.0 – 14 Jun 2013

  • Support Delphi XE4 Win32 and Win64.

YuStemmer 2.5.0 – 4 Oct 2012

  • Support Delphi XE3 Win32 and Win64.

YuStemmer 2.1.0 – 8 Nov 2011

  • Support Delphi XE2 Win64.

YuStemmer 2.0.0 – 15 Oct 2011

  • Support Delphi XE2 Win32.

YuStemmer 1.1.0 – 28 Sep 2010

  • Support Delphi XE.
  • German stemmers: Add a new rule to reduce -nisse (and -nissen and -nisses) to -nis. This improves the stemming of “Kürbisse”, for example, which was reduced to “Kürbiss” and not “Kürbis”. Database tokenizers need to rebuild their indexes if they use any of the German stemmers.

YuStemmer 1.0.0 – 5 Dec 2009

  • First release.
products/stemmer/history.txt · Last modified: 2019/03/20 12:25 (external edit)