Documentation

Voikko

Main class of the library.

Table of Contents

__construct()  : mixed
Initialises the library for use in the specified language, optionally adding an extra directory to the standard dictionary search path.
__destruct()  : mixed
analyzeWord()  : array<int, Analysis>
Analyzes the morphology of the given word.
dictionaries()  : array<int, Dictionary>
Get a list of available dictionaries.
grammarErrors()  : array<int, GrammarError>
Check grammar errors in a paragraph or sentence.
hyphenate()  : string
Hyphenates the given word.
hyphenationPattern()  : string
Return hyphenation pattern for the given word.
sentences()  : array<int, Sentence>
Split the given text into sentences.
setAcceptAllUppercase()  : void
Accept words even when all of the letters are in uppercase.
setAcceptBulletedListsInGc()  : void
Accept paragraphs if they would be valid within bulleted lists (grammar checking only).
setAcceptExtraHyphens()  : void
Allow some extra hyphens in words (spell checking only).
setAcceptFirstUppercase()  : void
Accept words even when the first letter is in uppercase (start of sentence etc.).
setAcceptMissingHyphens()  : void
Accept missing hyphens at the start and end of the word (spell checking only).
setAcceptTitlesInGc()  : void
Accept incomplete sentences that could occur in titles or headings (grammar checking only).
setAcceptUnfinishedParagraphsInGc()  : void
Accept incomplete sentences at the end of the paragraph (grammar checking only).
setHyphenateUnknownWords()  : void
Hyphenate unknown words (hyphenation only).
setIgnoreDot()  : void
Ignore dot at the end of the word (needed for use in some word processors).
setIgnoreNonwords()  : void
Ignore non-words such as URLs and email addresses (spell checking only).
setIgnoreNumbers()  : void
Ignore words containing numbers (spell checking only).
setIgnoreUppercase()  : void
Accept words that are written completely in uppercase letters without checking them at all.
setMinHyphenatedWordLength()  : void
Set the minimum length for words that may be hyphenated.
setNoUglyHyphenation()  : void
Do not insert hyphenation positions that are considered to be ugly but correct.
setOcrSuggestions()  : void
Use suggestions optimized for optical character recognition software.
setSpellerCacheSize()  : void
Set the size of the spell checker cache.
spell()  : bool
Checks the spelling of the given word.
suggest()  : array<int, string>
Finds suggested correct spellings for the given word.
tokens()  : array<int, Token>
Split the given text into tokens.

Methods

__construct()

Initialises the library for use in the specified language, optionally adding an extra directory to the standard dictionary search path.

public __construct([string $languageCode = 'fi' ][, string $dictionaryPath = null ][, string $libraryPath = "libvoikko.so.1" ]) : mixed
Parameters
$languageCode : string = 'fi'

BCP 47 language tag for the language to be used. Private use subtags can be used to specify the dictionary variant.

$dictionaryPath : string = null

Path to a directory that is searched for dictionary files before the standard dictionary locations. If null, no additional search path is used.

$libraryPath : string = "libvoikko.so.1"

Path to the libvoikko shared library.

Tags
throws
Exception

If initialization fails.

Return values
mixed

__destruct()

public __destruct() : mixed
Return values
mixed

analyzeWord()

Analyzes the morphology of the given word.

public analyzeWord(string $word) : array<int, Analysis>
Parameters
$word : string

Word to be analyzed.

Return values
array<int, Analysis>

Array of analysis results. An empty array is returned for unknown words.

dictionaries()

Get a list of available dictionaries.

public static dictionaries([string $dictionaryPath = null ][, string $libraryPath = "libvoikko.so.1" ]) : array<int, Dictionary>
Parameters
$dictionaryPath : string = null

Path to a directory that is searched for dictionary files before the standard dictionary locations.

$libraryPath : string = "libvoikko.so.1"

Path to the libvoikko shared library.

Return values
array<int, Dictionary>

Array of dictionaries

grammarErrors()

Check grammar errors in a paragraph or sentence.

public grammarErrors(string $text[, string $languageCode = 'en' ]) : array<int, GrammarError>
Parameters
$text : string

A paragraph or sentence to check grammar errors in.

$languageCode : string = 'en'

ISO language code of the language in which error descriptions are returned.

Return values
array<int, GrammarError>

Array of grammar errors

hyphenate()

Hyphenates the given word.

public hyphenate(string $word[, string $hyphen = '-' ][, bool $allowContextChanges = true ]) : string
Parameters
$word : string

Word to hyphenate

$hyphen : string = '-'

Character string to insert at hyphenation positions

$allowContextChanges : bool = true

Whether hyphens may be inserted even if they alter the word in unhyphenated form.

Return values
string

Hyphenated word

hyphenationPattern()

Return hyphenation pattern for the given word.

public hyphenationPattern(string $word) : string

The hyphenation pattern uses the following notation:

' ' = no hyphenation at this character
'-' = hyphenation point (character at this position
      is preserved in the hyphenated form)
'=' = hyphenation point (character at this position
      is replaced by the hyphen)
Parameters
$word : string

Word to hyphenate

Return values
string

Hyphenation pattern
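
The notation above can be read as a recipe for building the hyphenated form from a word and its pattern. The following is a minimal sketch in plain PHP, not part of this library: `applyHyphenationPattern` is a hypothetical helper name, and the sketch assumes that at a '-' position the hyphen is placed immediately before the preserved character.

```php
<?php
/**
 * Hypothetical helper (not part of the Voikko class): builds a hyphenated
 * word from a word and a pattern in the notation documented above.
 * Assumption: at a '-' position the hyphen goes before the character.
 */
function applyHyphenationPattern(string $word, string $pattern, string $hyphen = '-'): string
{
    // Split both strings into UTF-8 characters so that letters such as
    // "ä" line up with a single pattern position.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $marks = preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY);

    if ($chars === false || $marks === false || count($chars) !== count($marks)) {
        throw new InvalidArgumentException('Word and pattern must be valid UTF-8 of equal length.');
    }

    $result = '';
    foreach ($chars as $i => $char) {
        switch ($marks[$i]) {
            case '-': // hyphenation point, character is preserved
                $result .= $hyphen . $char;
                break;
            case '=': // hyphenation point, character is replaced by the hyphen
                $result .= $hyphen;
                break;
            default:  // ' ' means no hyphenation at this character
                $result .= $char;
        }
    }
    return $result;
}
```

With hand-written patterns (chosen for illustration, not taken from Voikko output), `applyHyphenationPattern('kissa', '   - ')` gives `kis-sa`, and `applyHyphenationPattern("rei'ittää", '   =     ')` gives `rei-ittää` because the apostrophe sits at the '=' position.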

sentences()

Split the given text into sentences.

public sentences(string $text) : array<int, Sentence>
Parameters
$text : string

Text to split into sentences.

Return values
array<int, Sentence>

Array of sentences

setAcceptAllUppercase()

Accept words even when all of the letters are in uppercase.

public setAcceptAllUppercase(bool $value) : void

Note that this is not the same as setIgnoreUppercase: with this option the word is still checked, and only case differences are ignored.

Default: true

Parameters
$value : bool
Return values
void

setAcceptBulletedListsInGc()

Accept paragraphs if they would be valid within bulleted lists (grammar checking only).

public setAcceptBulletedListsInGc(bool $value) : void

Default: false

Parameters
$value : bool
Return values
void

setAcceptExtraHyphens()

Allow some extra hyphens in words (spell checking only).

public setAcceptExtraHyphens(bool $value) : void

This option relaxes hyphen checking rules to work around some unresolved issues in the underlying morphology, but it may cause some incorrect words to be accepted. The exact behavior (if any) of this option is not specified.

Default: false

Parameters
$value : bool
Return values
void

setAcceptFirstUppercase()

Accept words even when the first letter is in uppercase (start of sentence etc.).

public setAcceptFirstUppercase(bool $value) : void

Default: true

Parameters
$value : bool
Return values
void

setAcceptMissingHyphens()

Accept missing hyphens at the start and end of the word (spell checking only).

public setAcceptMissingHyphens(bool $value) : void

Some application programs do not consider hyphens to be word characters. This is a reasonable assumption for many languages but not for Finnish. If the application cannot be fixed to use a proper tokenisation algorithm for Finnish, this option may be used to tell libvoikko to work around this defect.

Default: false

Parameters
$value : bool
Return values
void

setAcceptTitlesInGc()

Accept incomplete sentences that could occur in titles or headings (grammar checking only).

public setAcceptTitlesInGc(bool $value) : void

Set this option to true if your application is not able to differentiate titles from normal text paragraphs, or if you know that you are checking title text.

Default: false

Parameters
$value : bool
Return values
void

setAcceptUnfinishedParagraphsInGc()

Accept incomplete sentences at the end of the paragraph (grammar checking only).

public setAcceptUnfinishedParagraphsInGc(bool $value) : void

Such sentences may exist while the text is still being written.

Default: false

Parameters
$value : bool
Return values
void

setHyphenateUnknownWords()

Hyphenate unknown words (hyphenation only).

public setHyphenateUnknownWords(bool $value) : void

Default: true

Parameters
$value : bool
Return values
void

setIgnoreDot()

Ignore dot at the end of the word (needed for use in some word processors).

public setIgnoreDot(bool $value) : void

If this option is set and the input word ends with a dot, the spell checking and hyphenation functions try to analyze the word without the dot if no results can be obtained for the original form. With this option, the string tokenizer also considers the trailing dot of a word to be a part of that word.

Default: false

Parameters
$value : bool
Return values
void
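
The fallback described for this option can be pictured with a small sketch. It is not libvoikko's implementation: `checkWithDotFallback` and the `$check` callback are hypothetical names, and the callback stands in for whatever lookup is being performed (for example a call to `spell()`).

```php
<?php
/**
 * Illustrative only: tries the original form first and, if that yields
 * nothing and the option is on, retries without the trailing dot.
 * $check is a hypothetical callback that returns true for accepted words.
 */
function checkWithDotFallback(callable $check, string $word, bool $ignoreDot): bool
{
    // The original form always gets the first chance.
    if ($check($word)) {
        return true;
    }

    // Retry without the dot only when the option is enabled and the word
    // is more than a lone dot.
    if ($ignoreDot && strlen($word) > 1 && substr($word, -1) === '.') {
        return (bool) $check(substr($word, 0, -1));
    }

    return false;
}
```

In this sketch a word such as `kissa.` is rejected when `$ignoreDot` is false and accepted when it is true, provided the callback accepts `kissa`.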

setIgnoreNonwords()

Ignore non-words such as URLs and email addresses (spell checking only).

public setIgnoreNonwords(bool $value) : void

Default: true

Parameters
$value : bool
Return values
void

setIgnoreNumbers()

Ignore words containing numbers (spell checking only).

public setIgnoreNumbers(bool $value) : void

Default: false

Parameters
$value : bool
Return values
void

setIgnoreUppercase()

Accept words that are written completely in uppercase letters without checking them at all.

public setIgnoreUppercase(bool $value) : void

Default: false

Parameters
$value : bool
Return values
void

setMinHyphenatedWordLength()

Set the minimum length for words that may be hyphenated.

public setMinHyphenatedWordLength(int $value) : void

This limit is also enforced on individual parts of compound words.

Default: 2

Parameters
$value : int
Return values
void

setNoUglyHyphenation()

Do not insert hyphenation positions that are considered to be ugly but correct.

public setNoUglyHyphenation(bool $value) : void

Default: false

Parameters
$value : bool
Return values
void

setOcrSuggestions()

Use suggestions optimized for optical character recognition software.

public setOcrSuggestions(bool $value) : void

By default, suggestions are optimized for typing errors.

Default: false

Parameters
$value : bool
Return values
void

setSpellerCacheSize()

Set the size of the spell checker cache.

public setSpellerCacheSize(int $value) : void

The value can be -1 (no cache) or >= 0, in which case the cache size in bytes is 2^value * (6544 * sizeof(wchar_t) + 1008).

Default: 0

Parameters
$value : int
Return values
void
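
To get a feel for the numbers, the size formula given for this option can be evaluated directly. The sketch below is plain PHP and does not call the library; `spellerCacheBytes` is a hypothetical helper, and the default of 4 bytes for `wchar_t` is an assumption that holds on typical Linux systems (it is 2 bytes on Windows).

```php
<?php
/**
 * Hypothetical helper: evaluates the documented formula
 * 2^value * (6544 * sizeof(wchar_t) + 1008).
 * $wcharSize is an assumption about the platform, not something the
 * Voikko class reports.
 */
function spellerCacheBytes(int $value, int $wcharSize = 4): int
{
    if ($value === -1) {
        return 0; // -1 means no cache
    }
    // The upper bound is this sketch's own guard against integer overflow,
    // not a limit documented by the library.
    if ($value < -1 || $value > 30) {
        throw new InvalidArgumentException('Value must be -1 or between 0 and 30 in this sketch.');
    }
    return (1 << $value) * (6544 * $wcharSize + 1008);
}
```

Under these assumptions the default value 0 corresponds to 27184 bytes with a 4-byte `wchar_t`, and each increment of the value doubles the size.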

spell()

Checks the spelling of the given word.

public spell(string $word) : bool
Parameters
$word : string

Word to check

Tags
throws
Exception

on error

Return values
bool

Whether the spelling is correct or not

suggest()

Finds suggested correct spellings for the given word.

public suggest(string $word) : array<int, string>
Parameters
$word : string

Word to find suggestions for

Return values
array<int, string>

Array of suggestions
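
A common pattern is to combine `spell()` and `suggest()`: check each word and ask for suggestions only when the check fails. The sketch below relies only on the two signatures documented on this page; `suggestCorrections` is a hypothetical helper, and it accepts any object exposing those two methods so that it can be tried without the native library.

```php
<?php
/**
 * Hypothetical helper: returns suggestions for every word that fails
 * the spell check, keyed by the misspelled word.
 * $voikko is expected to provide spell(string): bool and
 * suggest(string): array, as documented for the Voikko class.
 */
function suggestCorrections(object $voikko, array $words): array
{
    $corrections = [];
    foreach ($words as $word) {
        // spell() is documented to throw an Exception on error; this
        // sketch lets it propagate to the caller.
        if (!$voikko->spell($word)) {
            $corrections[$word] = $voikko->suggest($word);
        }
    }
    return $corrections;
}
```

With a real instance this might be called as `suggestCorrections(new Voikko('fi'), $words)`, assuming the class is loaded and libvoikko with a Finnish dictionary is installed. Correctly spelled words do not appear in the result at all.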

tokens()

Split the given text into tokens.

public tokens(string $text) : array<int, Token>
Parameters
$text : string

Text to split into tokens.

Return values
array<int, Token>

Array of tokens
