Started 1 day, 1 hour ago (2009-12-15 22:10:00)
by reckb
syam_mp wrote:
> Hi All,
> I am a new user wanting to use LingPipe for prototyping a tool which will
> help me
> 1. Create a topic list from existing text data.
>
I suggest you do Latent Dirichlet allocation (LDA) as per the
tutorial in:
http://alias-i.com/lingpipe/demos/tutorial/cluster/read-me.html
Topic summaries/labels can be suggested by the highest ranked words in
the identified...
Started 2 weeks, 2 days ago (2009-11-30 16:28:00)
by colloquialdo...
There are other subtle differences, but the main
difference is that tokenizers aren't quite as heavy
or as general as chunkers.
Chunkers are heavier because they return Chunking
objects, which contain sets of Chunk objects,
which contain starts, ends, types, and scores.
You can't get the string spanned by a chunk back
from the chunk itself -- you need the char sequence
underlying the ...
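To make the point about chunks concrete, here is a minimal sketch of recovering the spanned text from offsets plus the underlying char sequence. The Span class is a hypothetical stand-in, not LingPipe's Chunk; in LingPipe itself the matching accessors should be Chunk.start(), Chunk.end() and Chunking.charSequence(), but check the javadoc for your version.

```java
final class ChunkSpanSketch {

    // Hypothetical stand-in for a chunk: it carries offsets, a type and a
    // score, but deliberately no text.
    static final class Span {
        final int start;    // index of first character, inclusive
        final int end;      // index one past the last character, exclusive
        final String type;
        final double score;

        Span(int start, int end, String type, double score) {
            this.start = start;
            this.end = end;
            this.type = type;
            this.score = score;
        }
    }

    // The text of a span can only be recovered with the underlying sequence.
    static String spannedText(CharSequence text, Span span) {
        return text.subSequence(span.start, span.end).toString();
    }

    public static void main(String[] args) {
        CharSequence text = "John Smith lives in Boston.";
        Span person = new Span(0, 10, "PERSON", 1.0);
        Span location = new Span(20, 26, "LOCATION", 1.0);
        System.out.println(person.type + ": " + spannedText(text, person));
        System.out.println(location.type + ": " + spannedText(text, location));
    }
}
```

Keeping only offsets in the span means many spans can share one underlying text without copying any of it.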
Started 3 weeks, 4 days ago (2009-11-21 18:01:00)
by reckb
Brian Frutchey wrote:
> I am hoping to improve the performance of my LingPipe Named Entity Recognition
> tests. Currently I am only able to extract entities from about 5K of text/sec
> using either of the below methods:
>
> {code}
> Chunker chunker =
> (Chunker)AbstractExternalizable.readObject(new
> File("ne-en-news-muc6.AbstractCharLmRescoringChunker"));
>
> Chunking entities = chunker.chunk...
Started 1 month, 2 weeks ago (2009-10-27 16:58:00)
by colloquialdo...
I'm afraid to say there is no "running LingPipe".
LingPipe is just a set of Java APIs. That means
you need to write Java code to access LingPipe's
functionality. (In this way, it's just like
Lucene.)
The commands (.bat and .sh files) are just there
for demo purposes. You can look at the code for
them in $LINGPIPE/demos/generic and design your
own commands or embed the same processing ...
Started 1 month, 4 weeks ago (2009-10-19 03:19:00)
by prasen_bea
Thanks for the quick response. Your results make much more sense. Maybe
I am doing something wrong with the LingPipe package, and that's why I
found it difficult to interpret the LingPipe output:
And my U,V matrices are :
>> U:
         0        1
0    -0.69     0.65
1     0.0539  -0.205
2    -0.721    0.728
>> V:
         0        1
0    -0.74    -0.141
1     0.35    -0.613
2    -0.494    0.776
3     0.271    0.0207
-Prasen
On Sun, Oct 18, 2009 ...
Started 2 months ago (2009-10-17 00:07:00)
by adougher9
BA YORO I can probably help with the Wikipedia processing, as I have custom and
existing Perl modules for that. I have converted Wikipedia to dictd format
before. If you have server space I can set it up on that, as my own computer
systems are overstretched. I'm interested in using Wikipedia (and more) to
build a very large Word/Phrase/Acronym definition system, using word sense
induction ...
Started 2 months ago (2009-10-16 21:28:00)
by colloquialdo...
The SVD algorithm should work for any matrix.
There are two variants -- sparse and partial.
If your matrix has specified values, and the
others are zeroes, use the svd() method. If
the matrix has some known values and the others
are unknown, use the partialSvd() method.
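To illustrate the difference, here is a small sketch of the two error measures that distinction implies. This is not LingPipe's code, and the class and method names are made up for the example; it only shows how the same approximation scores when unlisted cells are treated as zeros and when they are ignored as unknown.

```java
final class SvdErrorSketch {

    // Squared error over every cell: an unlisted cell is an observed zero,
    // so an approximation is penalized for putting mass there.
    static double fullError(double[][] target, double[][] approx) {
        double sum = 0.0;
        for (int i = 0; i < target.length; ++i) {
            for (int j = 0; j < target[i].length; ++j) {
                double diff = target[i][j] - approx[i][j];
                sum += diff * diff;
            }
        }
        return sum;
    }

    // Squared error over known cells only: unknown cells are skipped, so
    // the approximation may put any value there at no cost.
    static double partialError(double[][] target, boolean[][] known,
                               double[][] approx) {
        double sum = 0.0;
        for (int i = 0; i < target.length; ++i) {
            for (int j = 0; j < target[i].length; ++j) {
                if (!known[i][j]) continue;
                double diff = target[i][j] - approx[i][j];
                sum += diff * diff;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] target = { { 1.0, 0.0 }, { 0.0, 2.0 } };
        boolean[][] known = { { true, false }, { false, true } };
        double[][] approx = { { 1.0, 1.0 }, { 1.0, 2.0 } };
        System.out.println("all cells:   " + fullError(target, approx));
        System.out.println("known cells: " + partialError(target, known, approx));
    }
}
```

The approximation here matches both known cells exactly, so it is only penalized when the off-diagonal cells count as zeros.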
You might have a learning rate that's too high
for the size/density of the problem. Have you tried
different (...
Started 2 months ago (2009-10-13 17:48:00)
by colloquialdo...
prasenjit mukherjee wrote:
> Is there a utility class to generate a random matrix ( given
> dimensions m,n ) in lingpipe ?
No, but it's really easy. If you want to populate
an M x N matrix with a random double between 0 and 1:
int M = 5; // rows
int N = 7; // columns
Random random = new Random();
Matrix matrix = new DenseMatrix(M,N);
for (int m = 0; m < M; ++m)
    for (int n = 0; n < N;...
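For anyone who wants something runnable without LingPipe on the classpath, here is a minimal sketch of the same idea using a plain double[][]. The class and method names are made up for the example. As far as I know DenseMatrix also has a constructor that takes a double[][], so the array could be wrapped afterwards, but treat that as an assumption and check the javadoc.

```java
import java.util.Random;

final class RandomMatrixSketch {

    // Fills a numRows x numCols array with uniform draws from [0.0, 1.0).
    static double[][] randomValues(int numRows, int numCols, Random random) {
        double[][] values = new double[numRows][numCols];
        for (int i = 0; i < numRows; ++i) {
            for (int j = 0; j < numCols; ++j) {
                values[i][j] = random.nextDouble();
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // A fixed seed makes the matrix reproducible from run to run.
        double[][] values = randomValues(5, 7, new Random(42L));
        System.out.println(values.length + " x " + values[0].length);
    }
}
```

Passing the Random in, rather than creating it inside the method, lets a caller fix the seed when a test needs the same matrix every time.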