SerialDatastore question

   Hi Andrei,

Firstly, if you want to take advantage of the Serial Datastore memory
saving features, you need to populate your corpus with documents after
you convert it to a serial corpus. The code you included actually keeps
all documents open in memory, which means that the serial DS gives you
no advantage.

What you need to do is:

//create the corpus
//add it to the serial DS and get a pointer to the new serial corpus
//(from here on, "corpus" is that serial corpus)
//for each document
  //add doc to the serial corpus
  corpus.add(doc);
  //unload doc
  corpus.unloadDocument(doc);
  //free memory
  Factory.deleteResource(doc);
...
//then process the corpus with the application.

However, this only makes sense if you need to keep the resulting
documents for later. If all you need is to process each document,
extract some data from it and then discard it then you're better off
processing documents one by one and not using a serial DS at all.
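
To illustrate the one-by-one pattern on its own, here is a minimal sketch in plain Java with no GATE calls. The class name OneByOne, the method tokenCounts and the whitespace token count are all made up for the example; in a real GATE application the "load" step would create a GATE document, the "extract" step would run your application over it, and the "discard" step would be Factory.deleteResource(doc). The point is only that one document is held in memory at a time and only the small extracted result is kept.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; not part of GATE.
public class OneByOne {

    // Loads each file in turn, extracts one small value from it
    // (a whitespace-separated token count) and lets the text go
    // out of scope before the next file is read.
    static Map<String, Integer> tokenCounts(List<Path> files) throws IOException {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Path file : files) {
            // load: only this one document is in memory
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // extract: keep only the data we need
            String trimmed = text.trim();
            int count = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
            result.put(file.getFileName().toString(), count);
            // discard: "text" is unreachable after this iteration
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    files.add(p);
                }
            }
        }
        tokenCounts(files).forEach((name, n) -> System.out.println(name + "\t" + n));
    }
}
```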

Also, are you changing the memory allocation for your Tomcat instance?
By default Java applications get a maximum heap of only 64M, which is
insufficient for running GATE. You need to increase that to at least
200M, by using the -Xmx200m option when calling the java command. You
can do this by editing the Tomcat start-up script (I think there's an
environment variable called CATALINA_OPTS which you can use for that).
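
A quick way to check whether the setting actually took effect is to ask the running JVM for its maximum heap. The sketch below uses only the standard Runtime.maxMemory() call; the class name HeapCheck is made up for the example. The value reported without any -Xmx option depends on the JVM version and the machine, so treat it as a check rather than as a fixed number.

```java
// Hypothetical example class; prints the heap limit of the current JVM.
public class HeapCheck {

    // Maximum amount of memory the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it once as "java HeapCheck" and once as "java -Xmx200m HeapCheck" to compare the two values. Inside Tomcat, the same Runtime call can be logged from your web application at start-up.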

     HTH,
    Valentin