San Francisco, October 25: The Common Crawl Foundation, a non-profit established in 2007 to make an accessible copy of the internet for public use, has announced a strategic partnership with Constellation Network, a Web3 blockchain ecosystem known for its solutions for the U.S. Department of Defense. The collaboration aims to use blockchain technology to make web-crawled data more accessible and useful for AI and data applications. The initiative will leverage Common Crawl’s extensive dataset, which spans over 250 billion web pages and supports large language models widely used in AI, with Constellation’s decentralized Hypergraph network adding data immutability, provenance, and auditability for transparent AI solutions.
With the AI industry projected to reach $3 trillion by 2030, there is a rising need for secure data sharing for training language models, efficient data storage, data monetization, and transparency in data sourcing. By combining Constellation’s integration of decentralized networks with traditional infrastructure and Common Crawl’s deep expertise in data, the partnership is set to further democratize data access.
“This partnership marks a major step forward in securing trusted distribution of Common Crawl,” said Rich Skrenta, Executive Director of the Common Crawl Foundation. “By combining our web archive with Constellation’s blockchain expertise, global researchers and developers can authenticate large open datasets, such as those for AI training.”
Ben Jorgensen, CEO of Constellation Network, added, “This partnership with Common Crawl showcases Web3’s value beyond cryptocurrency, aligning with our mission to provide a zero-trust network for a data-driven future. We aim to attract developers by demonstrating the advantages of immutability in digital workflows, setting ourselves apart from earlier blockchain generations.”
The partnership will roll out in phases, starting with a customizable subnet, or “metagraph,” which will incorporate a portion of Common Crawl’s data. This subnet is already live on Constellation’s test network and will soon be deployed on Hypergraph. Additional details about the live metagraph and ways for organizations and developers to participate will be announced in the coming weeks.
For more information, please visit:
- Common Crawl Foundation: https://commoncrawl.org
- Constellation Network: https://constellationnetwork.io
- X.com: @Conste11ation