Kaggle challenge - patent text analysis

The current Kaggle challenge (typically for machine learning … )

Can you extract meaning from a large, text-based dataset derived from inventions? Here’s your chance to do so.

In this competition, you will train your models on a novel semantic similarity dataset to extract relevant information by matching key phrases in patent documents. For example, if one invention claims “television set” and a prior publication describes “TV set”, a model would ideally recognize these are the same. The best solutions will extend beyond paraphrase identification and use the technical domain context to assist a patent attorney or examiner in retrieving relevant documents.

Total Prizes:


Entry Deadline:

June 13, 2022
Help the patent community connect the dots between millions of patent documents with your phrase-matching model.

Good luck,

Will Cukierski
Kaggle Data Scientist
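
To get a feel for why the announcement says the best solutions must go "beyond paraphrase identification", here is a minimal sketch of a purely surface-level baseline: Jaccard overlap of character trigrams. This is not the competition's scoring method or anyone's actual solution; the function names `char_ngrams` and `jaccard_similarity` and the third example phrase are my own assumptions for illustration.

```python
from __future__ import annotations


def char_ngrams(phrase: str, n: int = 3) -> set[str]:
    """Return the set of character n-grams of a lowercased, space-padded phrase."""
    padded = f" {phrase.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-grams, between 0.0 and 1.0."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty phrases are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # Hypothetical phrase pairs; only the first comes from the announcement.
    pairs = [
        ("television set", "TV set"),
        ("television set", "telephone set"),
    ]
    for anchor, target in pairs:
        score = jaccard_similarity(anchor, target)
        print(f"{anchor!r} vs {target!r}: {score:.2f}")
```

On these pairs the surface measure ranks "telephone set" as closer to "television set" than "TV set" is, which is the wrong way round. That gap is presumably what the domain context is meant to close.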


Multi-million dollar* solution wins prize of $25,000!!


The global patent analytics market is expected to grow from USD 817.0 million in 2021 to USD 1,859.6 million in 2028, at a CAGR of 12.5%.

I hope that anyone who manages to do this considers the long-term earning potential before giving it away :wink:

Sounds like what inteng.com.au are working on and advertising a role for: https://forums.adug.org.au/t/delphi-analyst-programmer-database-wfh-part-time/59221