Linux-Bulgaria.ORG

Re: [SOLUTION] Re: lug-bg: utf,ansi,unicode etc...


  • Subject: Re: [SOLUTION] Re: lug-bg: utf,ansi,unicode etc...
  • From: raptor <raptor@xxxxxxxxxx>
  • Date: Mon, 11 Aug 2003 16:02:11 +0300

|and that's good ;-) but as I said, there is no 100% strict method for guessing
|the encoding of what you pass in ... the detection here is a bit suspect,
|i.e. we have no ironclad 100% guarantee that we will hit the right input
|encoding:
|http://search.cpan.org/author/JNEYSTADT/cyrillic-1.05/Lingua/DetectCharset.pm
|This routine is implemented using algorithm of statistical analysis of text, 
|which was proved to be very efficient and showed around 99.98% accuracy in 
|tests.
|
|If we know the input encoding, the conversion afterwards is more or less easy,
|keeping in mind the exceptions for "symbols-out-of-range" ;-)
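
On the quoted point about conversion: once the input encoding is known, re-encoding is a decode followed by an encode, and the only real decision is what to do with characters the target charset cannot represent. A minimal sketch in Python (the thread itself is about a Perl module; the function name `convert` and the choice of cp1251 and KOI8-R are assumptions made for illustration):

```python
def convert(data: bytes, src: str, dst: str, on_error: str = "replace") -> bytes:
    """Re-encode bytes from charset src to charset dst.

    on_error is passed to str.encode(): "strict" raises UnicodeEncodeError
    for a symbol that dst cannot represent, "replace" substitutes "?".
    """
    text = data.decode(src)  # bytes -> Unicode, using the known input encoding
    return text.encode(dst, errors=on_error)


# The euro sign exists in cp1251 but not in KOI8-R, so it is "out of range".
source_bytes = "цена 5 €".encode("cp1251")
converted = convert(source_bytes, "cp1251", "koi8_r")
```

With `on_error="strict"` the same call raises instead of silently losing the symbol, which is probably what you want when the data must round-trip.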
]- I looked at the module; the guy detects the encoding in a very clever way... most likely he ran a statistical analysis on some texts (probably Russian, though who knows, he may have tried all the Cyrillic languages :") ) and every possible two-letter sequence gets a weight... the more often two letters (one right after the other) occur together, the "heavier" they are..
Then, when it checks a text, it picks whichever charset gives the higher probability/weight...
I suppose that if the same thing were done for Bulgarian text, it would guess the Bulgarian encoding better... but as far as I know there are no word "corpora" of that kind for the Bulgarian language... (we have read a little about linguistics too :") )
If the "corpus" is big enough and covers more subject areas, it really could reach 99.98% accuracy..
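
The idea above can be sketched roughly like this. It is only a guess at the general approach, written in Python, and not the actual code of Lingua::DetectCharset: the candidate charset list, the function names, and the one-sentence Bulgarian "corpus" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical candidate list: common 8-bit Cyrillic charsets.
CANDIDATES = ("cp1251", "koi8_r", "iso8859_5", "cp866")

# Made-up toy corpus; a real detector would need a large, varied one.
TOY_CORPUS = "това е малък пример за български текст на кирилица"


def is_cyrillic(ch: str) -> bool:
    """True for characters in the basic Cyrillic Unicode block."""
    return "\u0400" <= ch <= "\u04ff"


def letter_pairs(text: str):
    """Yield every pair of adjacent Cyrillic letters, ignoring case."""
    text = text.lower()
    for a, b in zip(text, text[1:]):
        if is_cyrillic(a) and is_cyrillic(b):
            yield a + b


def train_weights(corpus: str) -> dict:
    """Weight of a pair = its relative frequency in the corpus."""
    counts = Counter(letter_pairs(corpus))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def score(text: str, weights: dict) -> float:
    """Sum the weights of all letter pairs; unseen pairs weigh nothing."""
    return sum(weights.get(pair, 0.0) for pair in letter_pairs(text))


def detect(data: bytes, weights: dict, candidates=CANDIDATES):
    """Decode data with each candidate and keep the heaviest result."""
    best, best_score = None, -1.0
    for enc in candidates:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this charset
        s = score(text, weights)
        if s > best_score:
            best, best_score = enc, s
    return best


weights = train_weights(TOY_CORPUS)
guess = detect("български текст".encode("koi8_r"), weights)
```

Decoded with the wrong charset, the same bytes turn into letter pairs that almost never occur in real text, so their total weight stays near zero. With a corpus this tiny the guess is only reliable for text that looks like the corpus, which is exactly why a big corpus covering many subject areas matters.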

raptor

============================================================================
A mail-list of Linux Users Group - Bulgaria (bulgarian linuxers).
http://www.linux-bulgaria.org - Hosted by Internet Group Ltd. - Stara Zagora
To unsubscribe: http://www.linux-bulgaria.org/public/mail_list.html
============================================================================



 


Mailing list messages are © Copyright their authors.