Voikko
Main class of the library.
Table of Contents
- __construct() : mixed
- Initialises the library for use in the specified language, adding an extra directory to the standard dictionary search path.
- __destruct() : mixed
- analyzeWord() : array<int, Analysis>
- Analyzes the morphology of the given word.
- dictionaries() : array<int, Dictionary>
- Get a list of available dictionaries.
- grammarErrors() : array<int, GrammarError>
- Check grammar errors in a paragraph or sentence.
- hyphenate() : string
- Hyphenates the given word.
- hyphenationPattern() : string
- Return hyphenation pattern for the given word.
- sentences() : array<int, Sentence>
- Split the given text into sentences.
- setAcceptAllUppercase() : void
- Accept words even when all of the letters are in uppercase.
- setAcceptBulletedListsInGc() : void
- Accept paragraphs if they would be valid within bulleted lists (grammar checking only)
- setAcceptExtraHyphens() : void
- Allow some extra hyphens in words (spell checking only)
- setAcceptFirstUppercase() : void
- Accept words even when the first letter is in uppercase (start of sentence etc.)
- setAcceptMissingHyphens() : void
- Accept missing hyphens at the start and end of the word (spell checking only)
- setAcceptTitlesInGc() : void
- Accept incomplete sentences that could occur in titles or headings (grammar checking only)
- setAcceptUnfinishedParagraphsInGc() : void
- Accept incomplete sentences at the end of the paragraph (grammar checking only)
- setHyphenateUnknownWords() : void
- Hyphenate unknown words (hyphenation only)
- setIgnoreDot() : void
- Ignore dot at the end of the word (needed for use in some word processors).
- setIgnoreNonwords() : void
- Ignore non-words such as URLs and email addresses (spell checking only)
- setIgnoreNumbers() : void
- Ignore words containing numbers (spell checking only)
- setIgnoreUppercase() : void
- Accept words that are written completely in uppercase letters without checking them at all.
- setMinHyphenatedWordLength() : void
- Set the minimum length for words that may be hyphenated.
- setNoUglyHyphenation() : void
- Do not insert hyphenation positions that are considered to be ugly but correct
- setOcrSuggestions() : void
- Use suggestions optimized for optical character recognition software.
- setSpellerCacheSize() : void
- Set the size of the spell checker cache.
- spell() : bool
- Checks the spelling of the given word.
- suggest() : array<int, string>
- Finds suggested correct spellings for the given word.
- tokens() : array<int, Token>
- Split the given text into tokens.
Methods
__construct()
Initialises the library for use in the specified language, adding an extra directory to the standard dictionary search path.
public
__construct([string $languageCode = 'fi' ][, string $dictionaryPath = null ][, string $libraryPath = "libvoikko.so.1" ]) : mixed
Parameters
- $languageCode : string = 'fi'
-
BCP 47 language tag for the language to be used. Private use subtags can be used to specify the dictionary variant.
- $dictionaryPath : string = null
-
Path to a directory from which dictionary files should be searched first before looking into the standard dictionary locations. If null, no additional search path will be used.
- $libraryPath : string = "libvoikko.so.1"
-
Path to libvoikko shared library.
Return values
mixed
__destruct()
public
__destruct() : mixed
Return values
mixed
analyzeWord()
Analyzes the morphology of the given word.
public
analyzeWord(string $word) : array<int, Analysis>
Parameters
- $word : string
-
Word to be analyzed.
Return values
array<int, Analysis> —Array of analysis results. Empty array is returned for unknown words.
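Because an empty array signals an unknown word, a caller can check whether a word is recognised without inspecting the Analysis objects. The following is a minimal sketch, assuming `$voikko` is an already constructed Voikko instance; the helper name `isKnownWord` is hypothetical and not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): reports whether the
 * morphological analyzer recognises a word.
 *
 * @param object $voikko Any object exposing analyzeWord(string): array,
 *                       such as a Voikko instance.
 */
function isKnownWord(object $voikko, string $word): bool
{
    // analyzeWord() returns an empty array for unknown words.
    return count($voikko->analyzeWord($word)) > 0;
}
```

The parameter is typed as `object` only so that the sketch can be exercised with a stand-in object; in application code it would normally be typed as the Voikko class.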
dictionaries()
Get a list of available dictionaries.
public
static dictionaries([string $dictionaryPath = null ][, string $libraryPath = "libvoikko.so.1" ]) : array<int, Dictionary>
Parameters
- $dictionaryPath : string = null
-
Path to a directory from which dictionary files should be searched first before looking into the standard dictionary locations.
- $libraryPath : string = "libvoikko.so.1"
-
Path to libvoikko shared library.
Return values
array<int, Dictionary> —Array of dictionaries
grammarErrors()
Check grammar errors in a paragraph or sentence.
public
grammarErrors(string $text[, string $languageCode = 'en' ]) : array<int, GrammarError>
Parameters
- $text : string
-
A paragraph or sentence to check grammar errors in.
- $languageCode : string = 'en'
-
ISO language code for the language in which error descriptions should be returned.
Return values
array<int, GrammarError> —Array of grammar errors
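Since grammarErrors() takes one paragraph or sentence at a time, a longer text has to be split before checking. The sketch below counts errors per paragraph. It assumes that paragraphs are separated by blank lines and that `$voikko` is a constructed Voikko instance; the helper name `grammarErrorCounts` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): counts grammar errors
 * for each paragraph of a text.
 *
 * @param object $voikko Any object exposing grammarErrors(string, string): array.
 * @return array<int, int> Error count for each non-empty paragraph, in order.
 */
function grammarErrorCounts(object $voikko, string $text, string $languageCode = 'en'): array
{
    // Assumption: a paragraph break is two line breaks, optionally with
    // whitespace in between.
    $paragraphs = preg_split('/\R\s*\R/u', trim($text)) ?: [];

    $counts = [];
    foreach ($paragraphs as $paragraph) {
        $paragraph = trim($paragraph);
        if ($paragraph === '') {
            continue; // skip empty input
        }
        $counts[] = count($voikko->grammarErrors($paragraph, $languageCode));
    }
    return $counts;
}
```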
hyphenate()
Hyphenates the given word.
public
hyphenate(string $word[, string $hyphen = '-' ][, bool $allowContextChanges = true ]) : string
Parameters
- $word : string
-
Word to hyphenate.
- $hyphen : string = '-'
-
Character string to insert at hyphenation positions.
- $allowContextChanges : bool = true
-
Whether hyphens may be inserted even if they alter the word in unhyphenated form.
Return values
string —Hyphenated word
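The `$hyphen` parameter accepts any string, so the same call can produce output for different renderers. As an illustration, the sketch below passes a soft hyphen (U+00AD), which lets a web browser or layout engine break a word only where needed. It assumes `$voikko` is a constructed Voikko instance; the helper name `softHyphenateWords` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): hyphenates each word
 * with soft hyphens instead of visible hyphens.
 *
 * @param object             $voikko Any object exposing hyphenate(string, string): string.
 * @param array<int, string> $words  Words to hyphenate.
 * @return array<int, string> Words with U+00AD at each hyphenation position.
 */
function softHyphenateWords(object $voikko, array $words): array
{
    $softHyphen = "\u{00AD}";

    $result = [];
    foreach ($words as $word) {
        $result[] = $voikko->hyphenate($word, $softHyphen);
    }
    return $result;
}
```

The helper takes words that have already been separated, because splitting running text on non-letters would break Finnish words that contain hyphens.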
hyphenationPattern()
Return hyphenation pattern for the given word.
public
hyphenationPattern(string $word) : string
The hyphenation pattern uses the following notation:
' ' = no hyphenation at this character
'-' = hyphenation point (character at this position is preserved in the hyphenated form)
'=' = hyphenation point (character at this position is replaced by the hyphen)
Parameters
- $word : string
-
Word to hyphenate
Return values
string —Hyphenation pattern
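To make the notation concrete, the sketch below applies a pattern to a word. It rests on two assumptions that the text above does not spell out: the pattern contains one symbol per character (not per byte) of the word, and the hyphen goes in front of the character marked with '-'. The function name `applyHyphenationPattern` is hypothetical, and the patterns in the usage example are written by hand for illustration rather than taken from library output.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): builds the hyphenated
 * form of a word from a hyphenation pattern.
 */
function applyHyphenationPattern(string $word, string $pattern, string $hyphen = '-'): string
{
    // Split the UTF-8 word into characters; the pattern itself is ASCII.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    if (count($chars) !== strlen($pattern)) {
        throw new InvalidArgumentException('Pattern length must match word length.');
    }

    $out = '';
    foreach ($chars as $i => $char) {
        switch ($pattern[$i]) {
            case '-': // hyphenation point, character is preserved
                $out .= $hyphen . $char;
                break;
            case '=': // hyphenation point, character is replaced by the hyphen
                $out .= $hyphen;
                break;
            default:  // ' ': no hyphenation at this character
                $out .= $char;
        }
    }
    return $out;
}
```

With these assumptions, `applyHyphenationPattern('kissa', '   - ')` gives `kis-sa`, and `applyHyphenationPattern("vaa'an", '   =  ')` gives `vaa-an`, because the apostrophe is replaced by the hyphen.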
sentences()
Split the given text into sentences.
public
sentences(string $text) : array<int, Sentence>
Parameters
- $text : string
-
Text to split into sentences.
Return values
array<int, Sentence> —Array of sentences
setAcceptAllUppercase()
Accept words even when all of the letters are in uppercase.
public
setAcceptAllUppercase(bool $value) : void
Note that this is not the same as setIgnoreUppercase: with this option the word is still checked, only case differences are ignored.
Default: true
Parameters
- $value : bool
Return values
void
setAcceptBulletedListsInGc()
Accept paragraphs if they would be valid within bulleted lists (grammar checking only)
public
setAcceptBulletedListsInGc(bool $value) : void
Default: false
Parameters
- $value : bool
Return values
void
setAcceptExtraHyphens()
Allow some extra hyphens in words (spell checking only)
public
setAcceptExtraHyphens(bool $value) : void
This option relaxes hyphen checking rules to work around some unresolved issues in the underlying morphology, but it may cause some incorrect words to be accepted. The exact behavior (if any) of this option is not specified.
Default: false
Parameters
- $value : bool
Return values
void
setAcceptFirstUppercase()
Accept words even when the first letter is in uppercase (start of sentence etc.)
public
setAcceptFirstUppercase(bool $value) : void
Default: true
Parameters
- $value : bool
Return values
void
setAcceptMissingHyphens()
Accept missing hyphens at the start and end of the word (spell checking only)
public
setAcceptMissingHyphens(bool $value) : void
Some application programs do not consider hyphens to be word characters. This is a reasonable assumption for many languages, but not for Finnish. If the application cannot be fixed to use a proper tokenisation algorithm for Finnish, this option may be used to tell libvoikko to work around this defect.
Default: false
Parameters
- $value : bool
Return values
void
setAcceptTitlesInGc()
Accept incomplete sentences that could occur in titles or headings (grammar checking only)
public
setAcceptTitlesInGc(bool $value) : void
Set this option to true if your application is not able to differentiate titles from normal text paragraphs, or if you know that you are checking title text.
Default: false
Parameters
- $value : bool
Return values
void
setAcceptUnfinishedParagraphsInGc()
Accept incomplete sentences at the end of the paragraph (grammar checking only)
public
setAcceptUnfinishedParagraphsInGc(bool $value) : void
These may exist when text is still being written.
Default: false
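The grammar checking options are independent and all default to false, so an editor that checks text while it is being typed may want to relax several of them together. The sketch below shows one possible combination; the helper name `enableDraftGrammarChecking` is hypothetical, and which options to relax is an application-level choice rather than a recommendation from the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): relaxes grammar checking
 * for text that is still being written and whose structure is unknown.
 *
 * @param object $voikko Any object exposing the three setters used below,
 *                       such as a Voikko instance.
 */
function enableDraftGrammarChecking(object $voikko): void
{
    // The last sentence of the paragraph may still be incomplete.
    $voikko->setAcceptUnfinishedParagraphsInGc(true);
    // Titles cannot be told apart from body paragraphs.
    $voikko->setAcceptTitlesInGc(true);
    // List items cannot be told apart from body paragraphs.
    $voikko->setAcceptBulletedListsInGc(true);
}
```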
Parameters
- $value : bool
Return values
void
setHyphenateUnknownWords()
Hyphenate unknown words (hyphenation only)
public
setHyphenateUnknownWords(bool $value) : void
Default: true
Parameters
- $value : bool
Return values
void
setIgnoreDot()
Ignore dot at the end of the word (needed for use in some word processors).
public
setIgnoreDot(bool $value) : void
If this option is set and the input word ends with a dot, the spell checking and hyphenation functions try to analyze the word without the dot if no results can be obtained for the original form. With this option, the string tokenizer also considers the trailing dot of a word to be part of that word.
Default: false
Parameters
- $value : bool
Return values
void
setIgnoreNonwords()
Ignore non-words such as URLs and email addresses (spell checking only)
public
setIgnoreNonwords(bool $value) : void
Default: true
Parameters
- $value : bool
Return values
void
setIgnoreNumbers()
Ignore words containing numbers (spell checking only)
public
setIgnoreNumbers(bool $value) : void
Default: false
Parameters
- $value : bool
Return values
void
setIgnoreUppercase()
Accept words that are written completely in uppercase letters without checking them at all.
public
setIgnoreUppercase(bool $value) : void
Default: false
Parameters
- $value : bool
Return values
void
setMinHyphenatedWordLength()
Set the minimum length for words that may be hyphenated.
public
setMinHyphenatedWordLength(int $value) : void
This limit is also enforced on individual parts of compound words.
Default: 2
Parameters
- $value : int
Return values
void
setNoUglyHyphenation()
Do not insert hyphenation positions that are considered to be ugly but correct
public
setNoUglyHyphenation(bool $value) : void
Default: false
Parameters
- $value : bool
Return values
void
setOcrSuggestions()
Use suggestions optimized for optical character recognition software.
public
setOcrSuggestions(bool $value) : void
By default suggestions are optimized for typing errors.
Default: false
Parameters
- $value : bool
Return values
void
setSpellerCacheSize()
Set the size of the spell checker cache.
public
setSpellerCacheSize(int $value) : void
This can be -1 (no cache) or >= 0 (size in bytes = 2^cache_size * (6544*sizeof(wchar_t) + 1008)).
Default: 0
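The formula above can be evaluated directly to see how quickly the cache grows with the size parameter. The sketch below is only arithmetic and does not call the library. The function name `spellerCacheBytes` is hypothetical, and the default `sizeof(wchar_t)` of 4 bytes is an assumption that holds on typical Linux systems (it is 2 on Windows).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): memory used by the
 * spell checker cache for a given cache size parameter.
 *
 * @param int $cacheSize -1 for no cache, or a value >= 0.
 * @param int $wcharSize Assumed sizeof(wchar_t) in bytes.
 */
function spellerCacheBytes(int $cacheSize, int $wcharSize = 4): int
{
    if ($cacheSize < -1) {
        throw new InvalidArgumentException('Cache size must be -1 or >= 0.');
    }
    if ($cacheSize === -1) {
        return 0; // caching disabled
    }
    // 2^cache_size * (6544 * sizeof(wchar_t) + 1008); intended for small sizes.
    return (1 << $cacheSize) * (6544 * $wcharSize + 1008);
}
```

Under the 4-byte assumption, the default size 0 comes to 27184 bytes, and each increment of the parameter doubles that figure.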
Parameters
- $value : int
Return values
void
spell()
Checks the spelling of the given word.
public
spell(string $word) : bool
Parameters
- $word : string
-
Word to check
Return values
bool —Whether the spelling is correct or not
suggest()
Finds suggested correct spellings for the given word.
public
suggest(string $word) : array<int, string>
Parameters
- $word : string
-
Word to find suggestions for
Return values
array<int, string> —Array of suggestions
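spell() and suggest() are typically used together: suggestions are only requested for words that fail the spelling check. A minimal sketch, assuming `$voikko` is a constructed Voikko instance and the input has already been split into words; the helper name `suggestCorrections` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): collects suggestions
 * for every misspelled word in a list.
 *
 * @param object             $voikko Any object exposing spell(string): bool
 *                                   and suggest(string): array.
 * @param array<int, string> $words  Words to check.
 * @return array<string, array<int, string>> Misspelled word => suggestions.
 */
function suggestCorrections(object $voikko, array $words): array
{
    $corrections = [];
    foreach ($words as $word) {
        // Only ask for suggestions when the word is not accepted.
        if (!$voikko->spell($word)) {
            $corrections[$word] = $voikko->suggest($word);
        }
    }
    return $corrections;
}
```

Correctly spelled words do not appear in the result, so an empty array means that every word was accepted.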
tokens()
Split the given text into tokens.
public
tokens(string $text) : array<int, Token>
Parameters
- $text : string
-
Text to split into tokens.
Return values
array<int, Token> —Array of tokens