Package opennlp.tools.dictionary
Class Dictionary
java.lang.Object
opennlp.tools.dictionary.Dictionary
- All Implemented Interfaces:
Iterable<StringList>, SerializableArtifact
An iterable and serializable dictionary implementation.
Constructor Summary
Constructor - Description
Dictionary() - Initializes an empty Dictionary.
Dictionary(boolean caseSensitive) - Initializes an empty Dictionary.
Dictionary(InputStream in) - Initializes the Dictionary from an existing dictionary resource.
-
Method Summary
Modifier and Type, Method - Description
Set<String> asStringSet() - Converts this Dictionary to a Set<String>.
boolean contains(StringList tokens) - Checks if this dictionary has the given entry.
boolean equals(Object obj)
Class<?> getArtifactSerializerClass() - Retrieves the class which can serialize and recreate this artifact.
int getMaxTokenCount()
int getMinTokenCount()
int hashCode()
boolean isCaseSensitive()
Iterator<StringList> iterator()
static Dictionary parseOneEntryPerLine(Reader in) - Reads a Dictionary which has one entry per line.
void put(StringList tokens) - Adds the tokens to the dictionary as one new entry.
void remove(StringList tokens) - Removes the given tokens from the current instance.
void serialize(OutputStream out) - Writes the current instance to the given OutputStream.
int size()
String toString()
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
Dictionary
public Dictionary()
Initializes an empty Dictionary. By default, the resulting instance is not case-sensitive.
-
Dictionary
public Dictionary(boolean caseSensitive)
Initializes an empty Dictionary.
- Parameters:
caseSensitive - Whether the new instance operates case-sensitively.
-
Dictionary
public Dictionary(InputStream in) throws IOException
Initializes the Dictionary from an existing dictionary resource.
- Parameters:
in - The InputStream that references the dictionary content.
- Throws:
IOException - Thrown if IO errors occurred.
-
-
Method Details
-
put
public void put(StringList tokens)
Adds the tokens to the dictionary as one new entry.
- Parameters:
tokens - The new entry.
-
getMinTokenCount
public int getMinTokenCount()
-
getMaxTokenCount
public int getMaxTokenCount()
-
contains
public boolean contains(StringList tokens)
Checks if this dictionary has the given entry.
- Parameters:
tokens - The tokens of the entry to look up.
- Returns:
true if it contains the entry, false otherwise.
-
remove
public void remove(StringList tokens)
Removes the given tokens from the current instance.
- Parameters:
tokens - The tokens to be removed.
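
The put, contains, remove and token-count methods described above can be sketched together as follows. This is a minimal illustration that assumes the OpenNLP tools library is on the classpath; the class name DictionaryBasics and the helper build() are hypothetical and exist only for this example.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

// Hypothetical example class, not part of OpenNLP.
public class DictionaryBasics {

  // Builds a small dictionary using the default (case-insensitive) constructor.
  static Dictionary build() {
    Dictionary dict = new Dictionary();
    dict.put(new StringList("New", "York")); // one entry made of two tokens
    dict.put(new StringList("Berlin"));      // one entry made of one token
    return dict;
  }

  public static void main(String[] args) {
    Dictionary dict = build();

    // The default instance is not case-sensitive, so a lower-case query matches.
    System.out.println(dict.contains(new StringList("berlin")));

    // Shortest and longest entry, measured in tokens.
    System.out.println(dict.getMinTokenCount() + " .. " + dict.getMaxTokenCount());

    // Removing an entry makes later lookups for it fail.
    dict.remove(new StringList("Berlin"));
    System.out.println(dict.contains(new StringList("Berlin")));
  }
}
```

An entry is always looked up as a whole StringList, so a query for only "New" does not match the two-token entry "New York".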
-
iterator
public Iterator<StringList> iterator()
- Specified by:
iterator in interface Iterable<StringList>
- Returns:
An Iterator over all elements.
-
size
public int size()
- Returns:
The number of tokens in the current instance.
-
serialize
public void serialize(OutputStream out) throws IOException
Writes the current instance to the given OutputStream.
- Parameters:
out - A valid OutputStream, ready for serialization.
- Throws:
IOException - Thrown if IO errors occurred.
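
A round trip through serialize(OutputStream) and the InputStream constructor might look like the sketch below. It assumes the OpenNLP tools library is on the classpath and uses in-memory streams instead of files; the class name DictionaryRoundTrip and the helper roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

// Hypothetical example class, not part of OpenNLP.
public class DictionaryRoundTrip {

  // Serializes the dictionary into memory and reads it back into a new instance.
  static Dictionary roundTrip(Dictionary original) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      original.serialize(out);
      try (ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray())) {
        return new Dictionary(in);
      }
    } catch (IOException e) {
      // Rethrown unchecked only to keep the example short.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Dictionary original = new Dictionary();
    original.put(new StringList("New", "York"));

    Dictionary copy = roundTrip(original);
    System.out.println(copy.contains(new StringList("New", "York")));
  }
}
```

In real code the streams would typically wrap files or model resources, and the IOException would be handled by the caller.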
-
equals
public boolean equals(Object obj)
-
hashCode
public int hashCode()
-
toString
public String toString()
-
parseOneEntryPerLine
public static Dictionary parseOneEntryPerLine(Reader in) throws IOException
Reads a Dictionary which has one entry per line. The tokens inside an entry are whitespace delimited.
- Parameters:
in - A Reader instance used to parse the dictionary from.
- Returns:
The parsed Dictionary instance; guaranteed to be non-null.
- Throws:
IOException - Thrown if IO errors occurred during read and parse operations.
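
The one-entry-per-line format can be illustrated with a short sketch. It assumes the OpenNLP tools library is on the classpath and feeds the text from a StringReader rather than a file; the class name OneEntryPerLine and the helper parse are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

// Hypothetical example class, not part of OpenNLP.
public class OneEntryPerLine {

  // Parses plain text in which every line is one entry and
  // the tokens of an entry are separated by single spaces.
  static Dictionary parse(String text) {
    try (Reader in = new StringReader(text)) {
      return Dictionary.parseOneEntryPerLine(in);
    } catch (IOException e) {
      // Rethrown unchecked only to keep the example short.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Dictionary dict = parse("New York\nBerlin\n");
    System.out.println(dict.contains(new StringList("New", "York")));
    System.out.println(dict.contains(new StringList("Berlin")));
  }
}
```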
-
asStringSet
public Set<String> asStringSet()
Converts this Dictionary to a Set<String>.
Note: Only the AbstractCollection.iterator(), AbstractCollection.size() and AbstractCollection.contains(Object) methods are implemented.
If the entries of this dictionary consist of multiple tokens, only the first token of each entry will be part of the Set.
- Returns:
A Set containing all entries of this Dictionary.
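
A small sketch of the Set view, assuming the OpenNLP tools library is on the classpath. It deliberately uses single-token entries only, since multi-token entries are reduced to their first token as noted above; the class name DictionaryAsSet and the helper singleTokenView are hypothetical.

```java
import java.util.Set;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

// Hypothetical example class, not part of OpenNLP.
public class DictionaryAsSet {

  // Builds a dictionary of single-token entries and returns its Set<String> view.
  static Set<String> singleTokenView() {
    Dictionary dict = new Dictionary();
    dict.put(new StringList("Berlin"));
    dict.put(new StringList("Paris"));
    return dict.asStringSet();
  }

  public static void main(String[] args) {
    Set<String> names = singleTokenView();

    // Only iterator(), size() and contains(Object) are documented as implemented.
    System.out.println(names.size());
    System.out.println(names.contains("Berlin"));
    for (String name : names) {
      System.out.println(name);
    }
  }
}
```

Because only the read operations are documented as implemented, the returned Set is best treated as read-only.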
-
getArtifactSerializerClass
public Class<?> getArtifactSerializerClass()
Description copied from interface: SerializableArtifact
Retrieves the class which can serialize and recreate this artifact.
Note: The serializer class must have a public zero argument constructor or an exception is thrown during model serialization/loading.
- Specified by:
getArtifactSerializerClass in interface SerializableArtifact
- Returns:
The serializer class for Dictionary.
-
isCaseSensitive
public boolean isCaseSensitive()
- Returns:
true, if this Dictionary is case-sensitive.
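
The effect of the case-sensitivity flag can be sketched by comparing the two constructors. This assumes the OpenNLP tools library is on the classpath; the class name CaseSensitivityDemo and the helper withBerlin are hypothetical.

```java
import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.util.StringList;

// Hypothetical example class, not part of OpenNLP.
public class CaseSensitivityDemo {

  // Creates a dictionary with the requested case handling and one entry.
  static Dictionary withBerlin(boolean caseSensitive) {
    Dictionary dict = new Dictionary(caseSensitive);
    dict.put(new StringList("Berlin"));
    return dict;
  }

  public static void main(String[] args) {
    Dictionary strict = withBerlin(true);
    Dictionary relaxed = withBerlin(false);

    // The flag passed to the constructor is reported back unchanged.
    System.out.println(strict.isCaseSensitive());
    System.out.println(relaxed.isCaseSensitive());

    // Only the case-insensitive instance accepts a lower-case query.
    System.out.println(strict.contains(new StringList("berlin")));
    System.out.println(relaxed.contains(new StringList("berlin")));
  }
}
```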
-