Storing and retrieving JSON objects to/from Lucene indexes

Posted by: admin, December 28, 2021

Questions:

I have to store a set of JSON objects in a Lucene index and also want to retrieve them from the index. I am using Lucene 3.4.

Is there any library or easy mechanism to do this in Lucene?

Sample JSON object:

{
    "BOOKNAME1": {
        "id": 1,
        "name": "bname1",
        "price": "p1"
    },
    "BOOKNAME2": {
        "id": 2,
        "name": "bname2",
        "price": "p2"
    },
    "BOOKNAME3": {
        "id": 3,
        "name": "bname3",
        "price": "p3"
    }
}

Any sort of help will be appreciated.
Thanks in advance,

Answers:

I would recommend indexing your JSON objects as follows:

1) Parse your JSON file. I usually use json-simple.

2) Open an index with an IndexWriter configured through IndexWriterConfig.

3) Add one document per JSON object to the index.

4) Commit the changes and close the index writer.

5) Run your queries.
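
The five steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from the sample project mentioned in this answer. It assumes Lucene 4.8 (core and analyzers-common) and json-simple are on the classpath. The class name, the in-memory RAMDirectory, the field names and the inline JSON string (with quoted keys so that it parses) are all made up for the example.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;

public class JsonIndexSketch {

    // Hypothetical input; the keys are quoted so that json-simple can parse it.
    static final String BOOKS =
        "{\"BOOKNAME1\":{\"id\":1,\"name\":\"bname1\",\"price\":\"p1\"},"
      + "\"BOOKNAME2\":{\"id\":2,\"name\":\"bname2\",\"price\":\"p2\"}}";

    /** Indexes every book in BOOKS, then returns the price of the book with the given name (or null). */
    static String indexAndFindPrice(String name) throws Exception {
        // 1) Parse the JSON text.
        JSONObject books = (JSONObject) new JSONParser().parse(BOOKS);

        // 2) Open an index (in memory here) using IndexWriterConfig.
        Directory dir = new RAMDirectory();
        IndexWriterConfig config =
            new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
        IndexWriter writer = new IndexWriter(dir, config);

        // 3) Add one document per book. StringField indexes the value as a single exact token.
        for (Object value : books.values()) {
            JSONObject book = (JSONObject) value;
            Document doc = new Document();
            doc.add(new StringField("id", String.valueOf(book.get("id")), Field.Store.YES));
            doc.add(new StringField("name", (String) book.get("name"), Field.Store.YES));
            doc.add(new StringField("price", (String) book.get("price"), Field.Store.YES));
            writer.addDocument(doc);
        }

        // 4) Commit the changes and close the writer.
        writer.commit();
        writer.close();

        // 5) Run a query: exact match on the "name" field.
        DirectoryReader reader = DirectoryReader.open(dir);
        try {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(new TermQuery(new Term("name", name)), 1);
            if (hits.totalHits == 0) {
                return null;
            }
            return searcher.doc(hits.scoreDocs[0].doc).get("price");
        } finally {
            reader.close();
        }
    }
}
```

For full-text search on a field such as the name, a TextField together with a QueryParser would probably be a better fit than StringField and TermQuery; exact matching just keeps the sketch short.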

If you would like to use Lucene Core instead of Elasticsearch, I have created a sample project that takes a file of JSON objects as input and creates an index from it. I have also added a test that queries the index.

I am using Lucene 4.8, the latest version at the time of writing. Please have a look here:

http://ignaciosuay.com/getting-started-with-lucene-and-json-indexing/

If you have time, I think it is worth reading “Lucene in Action”.

Hope it helps.

###

If you don’t want to search within the JSON but only store it, you just need to extract the id, which will hopefully be unique. Your Lucene document would then have two fields:

  • the id (indexed, not necessarily stored)
  • the json itself, as it is (only stored)

Once you have stored your JSON in Lucene, you can retrieve it by filtering on the id.
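
A minimal sketch of this two-field layout, assuming the Lucene 4.8 API (the question uses 3.4, where the field classes differ). The class, method and field names are made up for illustration. The id goes into a StringField that is indexed but not stored, and the raw JSON goes into a StoredField that is stored but not indexed.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class StoredJsonSketch {

    /** Stores the raw JSON under the given id, then reads it back by id (null if not found). */
    static String storeAndFetch(String id, String json) throws Exception {
        Directory dir = new RAMDirectory(); // in-memory index, for the example only
        IndexWriter writer = new IndexWriter(dir,
            new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48)));

        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.NO)); // indexed, not stored
        doc.add(new StoredField("json", json));             // stored, not indexed

        // updateDocument first deletes any document with the same id, which keeps ids unique.
        writer.updateDocument(new Term("id", id), doc);
        writer.commit();
        writer.close();

        DirectoryReader reader = DirectoryReader.open(dir);
        try {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(new TermQuery(new Term("id", id)), 1);
            if (hits.totalHits == 0) {
                return null;
            }
            return searcher.doc(hits.scoreDocs[0].doc).get("json");
        } finally {
            reader.close();
        }
    }
}
```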

On the other hand, this is pretty much what Elasticsearch does with your documents. You just send some JSON to it via a REST API. Elasticsearch keeps the JSON as it is and also makes it searchable by default. That means you can either retrieve the JSON by id or search against it out of the box, without having to write any indexing code.
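
As a rough illustration of the REST round trip, here is a sketch using the JDK's built-in HTTP client (Java 11 or later). The host and port (localhost:9200), the index name "books" and the `_doc` endpoint are assumptions; `_doc` is the path used by recent Elasticsearch versions, and older versions used a type name in that position. The main method needs a running node.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EsRestSketch {

    // Assumed local node and hypothetical index name.
    static final String BASE = "http://localhost:9200/books/_doc/";

    /** Builds the request that stores a JSON document under the given id. */
    static HttpRequest indexRequest(String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + id))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    /** Builds the request that fetches the document with the given id. */
    static HttpRequest getRequest(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + id)).GET().build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String json = "{\"id\":1,\"name\":\"bname1\",\"price\":\"p1\"}";

        // Store the document, then read it back by id.
        client.send(indexRequest("1", json), HttpResponse.BodyHandlers.ofString());
        HttpResponse<String> response =
            client.send(getRequest("1"), HttpResponse.BodyHandlers.ofString());

        // The response body wraps the original JSON together with some metadata.
        System.out.println(response.body());
    }
}
```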

Also, with Lucene your documents are not visible to searches until you commit them and reopen the index reader (or open a near-real-time reader from the writer), while Elasticsearch adds a handy transaction log, so that a GET by id is always real time.
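
The visibility behaviour on the Lucene side can be seen in a small sketch like the one below, again assuming the Lucene 4.8 API and using made-up class and field names. A reader is a point-in-time snapshot: it does not see documents added after it was opened, even once they are committed, until a new reader is obtained with DirectoryReader.openIfChanged.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class VisibilitySketch {

    static Document docWithId(String id) {
        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.YES));
        return doc;
    }

    /** Returns the document counts seen: before the commit, after it with the old reader, and after reopening. */
    static int[] docCounts() throws Exception {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir,
            new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48)));

        writer.addDocument(docWithId("1"));
        writer.commit();
        DirectoryReader reader = DirectoryReader.open(dir); // sees the first commit only

        writer.addDocument(docWithId("2"));  // added but not yet committed
        int beforeCommit = reader.numDocs();

        writer.commit();
        int staleReader = reader.numDocs();  // the old reader is still a snapshot

        // Returns a new reader because the index changed since 'reader' was opened.
        DirectoryReader fresh = DirectoryReader.openIfChanged(reader);
        int afterReopen = fresh.numDocs();

        reader.close();
        fresh.close();
        writer.close();
        return new int[] { beforeCommit, staleReader, afterReopen };
    }
}
```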

Elasticsearch also offers a lot more: a nice distributed infrastructure, faceting, scripting and more. Check it out!