Last commit for README.txt: 438ef9ce6d5f97392115d0bd415c808511b22590

Much better

suhit [2005-02-14 18:23:45]
============================
README FOR CONTENT EXTRACTOR
============================

1) Installation
==================

If you are installing from the installer -
------------------------------------------

To use the content extractor, simply double-click the file Crunch#.exe
(where # indicates the version of Crunch) and follow the instructions. The
installer will set up Crunch so that it is ready to run, whether or not you
have Java installed on your computer. You can then jump down to the Usage
section of this document.

If you are downloading from CVS -
-------------------------------

After getting the appropriate CVSROOT from one of the Crunch developers,
check out the module psl/crunch to get your own copy of Crunch. I recommend
that you do this in Eclipse, since this will also perhaps make you switch to
using Eclipse as your default IDE. :-) Once you have checked it out, you can
open it as an Eclipse Java project (the .project file is included). All the
source is in the 'src' directory. The package psl.crunch3 is the most
important one. The plugin infrastructure is in the package
psl.crunch3.plugins, which should explain the API that you need to implement
in order to create your own plugins. For your convenience, I have created a
sample plugin whose implementation you can find in
psl.crunch3.plugins.sample.

You are basically ready to run the project. You have the option to run it in
verbose mode: add --verbose to your runtime (program) arguments. Also, if you
want to use our clustering features, you should give the JVM the argument
-Xmx768m, which sets the maximum heap size to 768MB. (This is a JVM argument,
not a program argument; where you enter it depends on your IDE.) This much
memory will never really be used, but it leaves plenty of headroom. You are
now ready to use Crunch.
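
If you are not sure whether your IDE actually passed the two arguments above,
a small stand-alone check like the following can help. This is only a sketch:
the class LaunchCheck and its methods are made up for illustration and are not
part of Crunch. It relies on Runtime.maxMemory() from the standard library.

```java
// Hypothetical helper, NOT part of Crunch: reports the JVM heap limit and
// whether --verbose was passed as a program argument.
final class LaunchCheck {
    /** Converts a byte count to whole mebibytes (rounding down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Returns true if the literal flag --verbose appears in args. */
    static boolean isVerbose(String[] args) {
        for (String arg : args) {
            if ("--verbose".equals(arg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // maxMemory() is the most heap the JVM will try to use; -Xmx sets it.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("verbose flag present: " + isVerbose(args));
        System.out.println("max heap (MiB): " + toMiB(maxHeap));
    }
}
```

Run it with the same settings you intend to use for Crunch, for example
"java -Xmx768m LaunchCheck --verbose". Depending on the JVM, the reported
heap figure may come out slightly different from 768.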


2) Usage
===========

The input file and the output file are necessary but the settings file is
optional. The default settings file is settings.txt. To change the settings
without the settings GUI provided in the proxy, edit this file directly. It
is stored in the Java Properties file format; see the java.util.Properties
API documentation for the proper format.
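
Because the settings file is a Java Properties file, it can also be read and
changed with java.util.Properties instead of a text editor. The sketch below
shows one way to do that. The class SettingsEditor is hypothetical and not
part of Crunch, and this README does not list the real setting names, so any
key you pass in must come from your own settings.txt.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, NOT part of Crunch.
final class SettingsEditor {
    /** Loads the key=value pairs from a Java Properties file. */
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    /** Sets one key and writes the whole file back in Properties format. */
    static void set(Path file, String key, String value) throws IOException {
        Properties props = Files.exists(file) ? load(file) : new Properties();
        props.setProperty(key, value);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "edited outside the settings GUI");
        }
    }

    public static void main(String[] args) throws IOException {
        // Read-only by default: just print what the file currently contains.
        Path file = Paths.get(args.length > 0 ? args[0] : "settings.txt");
        load(file).list(System.out);
    }
}
```

Note that Properties.store() rewrites the whole file, so hand-written comments
and the original ordering of the entries are not preserved. Keep a backup copy
of settings.txt if that matters to you.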

Once the content extractor starts up, it will start the proxy on port 4000.
Configure your web browser to use a proxy on port 4000: on localhost if you
run Crunch on your own machine, or on the name of the server that you run it
on.

In Internet Explorer, this can be done by going to Tools -> Internet Options
-> Connections -> LAN Settings. Check the proxy server box and set the server
name to localhost (if on the local machine) or to the server that you are
running the proxy on. The default port is 4000 unless otherwise specified.

In Mozilla/Firefox, click on Tools -> Options, then click on the Connection
Settings button, which will launch a new window. Change the proxy settings
from Direct Connection to Manual. If running on the local machine, the server
is localhost. The default port is 4000 unless otherwise specified.
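
A browser is not the only possible client. As a quick way to test that the
proxy is up, a Java program can send a request through it using java.net.Proxy.
The following is a minimal sketch, assuming Crunch is running on localhost with
the default port 4000. The class ProxyClient and the example URL are made up
for illustration and are not part of Crunch.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

// Hypothetical client, NOT part of Crunch.
final class ProxyClient {
    /** Describes an HTTP proxy listening on the given host and port. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    /** Requests the URL through the proxy and returns the HTTP status code. */
    static int fetchStatus(String url, Proxy proxy) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection(proxy);
        try {
            conn.setConnectTimeout(5000); // milliseconds
            conn.setReadTimeout(5000);    // milliseconds
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: proxy on this machine, default port from this README.
        Proxy crunch = httpProxy("localhost", 4000);
        String url = args.length > 0 ? args[0] : "http://example.com/";
        System.out.println(url + " -> HTTP " + fetchStatus(url, crunch));
    }
}
```

If the proxy is not running, fetchStatus() fails with an IOException rather
than returning a status code. For an existing Java program, the standard
system properties -Dhttp.proxyHost=localhost -Dhttp.proxyPort=4000 achieve the
same routing without code changes.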


3) Contact
=============

Send email to suhit@cs.columbia.edu if you have any questions.