Can you extract meaning from a large, text-based dataset derived from inventions? Here’s your chance to do so.

In this competition, you will train your models on a novel semantic similarity dataset to extract relevant information by matching key phrases in patent documents. For example, if one invention claims “television set” and a prior publication describes “TV set”, a model would ideally recognize these are the same. The best solutions will extend beyond paraphrase identification and use the technical domain context to assist a patent attorney or examiner in retrieving relevant documents.

Help the patent community connect the dots between millions of patent documents with your phrase-matching model.

Multi-million dollar* solution wins prize of $25,000!!


The global patent analytics market is expected to grow from USD 817.0 million in 2021 to USD 1,859.6 million in 2028, at a CAGR of 12.5%.
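
For anyone who wants to sanity-check that growth figure, here is a quick sketch in Python. It only uses the numbers quoted above; the variable names are my own, and I'm assuming the rate compounds annually over the seven years from 2021 to 2028.

```python
# Figures quoted above, in USD millions. Variable names are illustrative only.
start_value = 817.0     # 2021 market size
end_value = 1859.6      # 2028 projected market size
years = 2028 - 2021     # assumed: 7 annual compounding periods

# CAGR = (end / start) ** (1 / years) - 1
implied_cagr = (end_value / start_value) ** (1 / years) - 1

# Going the other way: project 2021 forward at the quoted 12.5% rate.
projected_end = start_value * (1 + 0.125) ** years

print(f"Implied CAGR: {implied_cagr:.2%}")
print(f"2028 value at 12.5%: USD {projected_end:.1f} million")
```

Both directions land close to the quoted values (the implied rate rounds to roughly 12.5%), so the statistic looks internally consistent.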

I hope that if someone manages to do this, they consider the long-term earning potential before giving it away :wink:

