Growing network

Hypherdata empowers over 500 innovative companies to accelerate precision medicine that matters.

500+
Organizations, including pharmaceutical companies
400+
Vetted medical data providers across NA, EMEA, LATAM, APAC
32
Medical areas, including oncology, cardiology, neurology, and others
Where data meets discovery

Uniting emerging data providers
and Precision Medicine Pioneers for healthcare breakthroughs

Why our network?

Access to diverse, high-quality retrospective real-world data
Strong and growing network across 4 continents
Simplified data procurement process, from request submission to study completion, reducing administrative overhead and inefficiencies
Delivering public-private partnerships
Improved patient care and predictive benefits
Connect with the right study partner

Closed matchmaking platform

Building a network based
on trust and relevance

Why our platform?

Closed Data Procurement platform for client confidentiality
Clear Terms of Service with Hypherdata and 3rd parties
Support with communication and contracts
Transparent data governance & confidentiality
Verified sources and providers
Use cases

It is a match!

Use case categories

Comparative Effectiveness Data for a Phase 3 Cardiovascular Drug
Efficacy and Safety Data for a Phase 3 Oncology Treatment
Longitudinal Data on Cancer Treatment Progression
Patient Health Outcomes Data for Diabetes Treatment
Patient Journey Mapping
Patient-Reported Outcomes for a Phase 3 Rheumatoid Arthritis Treatment
Post-Market Safety Surveillance
Real-World Effectiveness Data for HEOR
AI for Cancer Detection
Building AI solution
Cardiology ECG
Data Aggregation & Integration 1
Data Aggregation & Integration 2
Data Annotation
Data Cleaning
Data Security & Privacy
Imaging Data
Improve AI Solution
Lung Cancer
Medical recordings
Microbiology
MRI Data
Neurology
Oncology
Oncology with MRI data
Pathology
Standardization / FAIR

Comparative Effectiveness Data for a Phase 3 Cardiovascular Drug

Objective

Obtain real-world comparative effectiveness data for a new Phase 3 cardiovascular drug.

The pharmaceutical company aimed to compare the effectiveness of their new drug with existing treatments for a specific cardiovascular condition. They required data on patient outcomes, medication adherence, incidence of cardiovascular events, and healthcare utilization.

Auction Parameters

The company detailed the specific cardiovascular condition, treatment regimens, and desired patient population size. Data providers, including electronic health record (EHR) vendors, insurance companies, and healthcare organizations, submitted bids to provide comparative data, focusing on data quality, depth of clinical information, and cost.

Efficacy and Safety Data for a Phase 3 Oncology Treatment

Objective

Gather real-world data on the efficacy and safety of a new Phase 3 oncology treatment.

The pharmaceutical company had conducted a Phase 3 clinical trial for a new cancer drug and required additional real-world data to support regulatory submissions and post-market surveillance. They required data on treatment outcomes, adverse events, patient quality of life, and overall survival rates.

Auction Parameters

The company specified the types of cancer being treated, patient demographics, and geographic regions of interest. Data providers, such as cancer registries, health systems, and research networks, submitted bids to supply the necessary data, emphasizing aspects like data comprehensiveness, patient coverage, and integration with clinical trial results.

Longitudinal Data on Cancer Treatment Progression

Objective

Obtain longitudinal data tracking the progression of cancer in patients undergoing a specific treatment regimen.

The pharmaceutical company aimed to understand long-term treatment outcomes, survival rates, quality of life, and recurrence rates in patients with a particular type of cancer. They needed detailed data on treatment protocols, patient follow-up visits, imaging results, biomarkers, and any subsequent therapies.

Auction Parameters

The auction invited bids from data providers capable of supplying extensive, high-quality longitudinal datasets. The company prioritized bids offering data with longer follow-up periods, more comprehensive patient tracking, and better integration with clinical trial data.

Patient Health Outcomes Data for Diabetes Treatment

Objective

Acquire comprehensive datasets on health outcomes for patients using a new diabetes treatment.

The pharmaceutical company sought real-world evidence on the efficacy and safety of the new diabetes medication, and on patient adherence to it. They required data including patient demographics, comorbidities, treatment regimens, glucose levels, and adverse events.

Auction Parameters

The company specified the desired data attributes, patient population size, geographical regions, and the time frame of data collection. Data providers (e.g., healthcare providers, insurance companies, health tech companies) submitted bids on the contract to supply the required data, competing on aspects such as data quality, coverage, and price.

Patient Journey Mapping

Objective

Understand the patient journey for individuals living with a specific chronic disease, from diagnosis through treatment and long-term management.

The Medical Affairs department needed detailed data on patient experiences, including diagnostic pathways, treatment regimens, adherence patterns, and long-term health outcomes. This data helped to identify gaps in care and opportunities for improving patient support programs.

Auction Parameters

The department specified the chronic disease of interest, key data points along the patient journey, and the required patient population size. Data providers, such as patient advocacy groups, health information exchanges, and digital health platforms, bid to provide the most detailed and accurate patient journey data, highlighting their ability to capture comprehensive longitudinal data and patient insights.

Patient-Reported Outcomes for a Phase 3 Rheumatoid Arthritis Treatment

Objective

Collect patient-reported outcomes (PROs) for a new Phase 3 treatment for rheumatoid arthritis.

The pharmaceutical company needed real-world data on patient-reported outcomes to complement their Phase 3 trial results. This included data on pain levels, physical functioning, medication side effects, and overall satisfaction with the treatment.

Auction Parameters

The company outlined the desired patient population characteristics, PRO measures, and data collection methods. Data providers, such as patient advocacy groups, digital health platforms, and clinical research organizations, submitted bids to supply the PRO data, highlighting their ability to capture comprehensive and accurate patient-reported information.

Post-Market Safety Surveillance

Objective

Monitor the long-term safety profile of a newly approved drug in a real-world population.

The Medical Affairs department sought to continuously monitor adverse events and other safety concerns associated with a new drug. They required comprehensive data from diverse sources, including electronic health records (EHRs), patient registries, and pharmacovigilance databases.

Auction Parameters

The department specified the types of adverse event data needed, the patient population size, and the geographical regions. Data providers submitted bids to offer the most comprehensive and timely safety data, focusing on aspects like data accuracy, coverage, and integration with existing pharmacovigilance systems.

Real-World Effectiveness Data for HEOR

Objective

Collect data to support health economics and outcomes research (HEOR) and demonstrate the real-world effectiveness of a treatment.

The Medical Affairs department required data on treatment effectiveness, patient-reported outcomes, healthcare utilization, and cost-effectiveness. This data was intended to support value-based discussions with payers and healthcare providers.

Auction Parameters

The department outlined the specific therapeutic area, patient demographics, and desired outcome measures. Data providers, such as health economics research organizations, health systems, and insurance companies, submitted bids to supply the required data, emphasizing the robustness of their data, the comprehensiveness of outcomes measured, and cost-effectiveness of their solution.

AI for Cancer Detection

Objective

Data Modality
MRI for cancer detection and corresponding EHR / EMR data with DICOM and EHR textual/structured data

Medical Area
Oncology

Data Diversity requirements

1,000 – 10,000 sample size

At least 1 cycle of <confidential> therapy

5 countries located within the European Union excluding the DACH region

Auction Parameters

The request came from a leading organisation in the DACH region of Europe and supported the company’s objective of training an advanced AI algorithm capable of generating synthetic versions of the hepatobiliary phase of liver MRI.

Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing pan-EU network of hospitals to provide a diverse and extensive dataset of liver MRI scans with comprehensive coverage of the hepatobiliary phase. The dataset encompassed a broad range of patient demographics, liver conditions, and imaging protocols, ensuring robust training of the AI algorithm.
– Ensured that the available data was annotated, de-identified, and cleaned for a compliant data exchange.
– Shortened the search by identifying diverse hospitals from our network, resolving the more complex questions about inclusion and exclusion criteria, and ensuring that the first introduction was a productive one.

The hepatobiliary phase is particularly valuable for assessing liver lesions, such as hepatocellular carcinoma (the most common form of liver cancer) and metastases. A successful outcome could revolutionize liver imaging by using artificial intelligence to generate the later phases of the MRI, significantly shortening the examination and improving overall efficiency.

Building AI solution

Objective

Data Modality
Electrocardiogram (ECG) for 45 diagnoses

Medical Area
Cardiology

Data Diversity requirements
10,000 – 100,000 sample size

Minimum 3 geographic locations spanning 3 continents

Auction Parameters

At Hypherdata, we addressed a leading company’s request for data to build its AI-driven cardiology solution. Our network of global data providers, both hospitals and clinicians, holds a diverse and extensive dataset of electrocardiograms (ECGs).

With global healthcare data coverage, we met the requirement for geographic diversity along with several other inclusion and exclusion criteria. This diverse data pool, with its robust sample size, ensured a well-rounded representation of cardiac conditions, accounting for variations in demographics, lifestyles, and healthcare practices worldwide.

Through our network of service providers for annotation and data cleaning, we ensure that the datasets we provide are of exceptional quality. Accurate and well-annotated data minimizes bias, enhances model interpretability, and boosts the overall reliability of the AI solution.

Once a match has been made, companies are invited to our deal room, where they can efficiently navigate the contractual aspects of accessing the data, annotations, and cleaning services. Our pre-established frameworks, templates, and structures facilitate smooth negotiations, allowing them to focus on their core AI development efforts.

Data Aggregation & Integration 1

Objective

Data Modality
Multimodal neuroimaging data for biomarker identification / validation with demographic information

Cleaning and standardising available data from multiple projects in the UK, the US, and Cuba

Anonymized, unaggregated EHR demographic information

Medical Area
Neurology

Data Diversity requirements

100,000+ sample size

Wide age distribution of the dataset

Globally representative sample

Auction Parameters

To aid research on biomarkers from day 115 to age 100, with the potential to support AI solutions in this area, a company reached out to Hypherdata with a large set of requirements as phase 1 of a multi-phase project.

Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals and partners to provide a diverse and extensive multimodal imaging dataset, including but not limited to MRI scans, PET scans, clinical trial information, and EHR / EMR data.
– Ensured that the available data was de-identified, annotated, and cleaned for a compliant data exchange.
– Shortened the search by identifying diverse healthcare providers from our network, especially those with digital capabilities and an intent to partner on research, resolving the more complex questions about inclusion and exclusion criteria, and ensuring that the first introduction was a productive one.

Data Aggregation & Integration 2

Objective

Data Modality

Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality

CT scans and / or Pathology images

Cleaned and annotated for quicker usage

Medical Area

Otorhinolaryngology / Lung Cancer

Data Diversity requirements

North America representing multiple healthcare providers

Auction Parameters

Hypherdata works closely with AI solution providers that have already delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. The diversity needed at different stages of market expansion implies different modalities, which either improve the product or drive greater innovation in the route to, or type of, solution.

A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.

With Hypherdata’s network, companies that are delivering AI solutions while also improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process becomes defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.

Data Security & Privacy

Objective

A mid-sized hospital is entering into a collaboration with a technology company specializing in medical recommendation systems for cancer research. Using AI, a new system is being developed to help the oncology department predict and diagnose breast cancer cases better and faster. The existing medical data, collected and maintained by the hospital IT department, will be used to train this new AI system.

The hospital’s management and IT department are looking for a sound strategy and roadmap to identify potential risks surrounding the AI company’s use of the data and to map out the technical project, from contracts to the continuous transfer of data for AI refinement and the input of AI services for predictive outcomes. Both organisations also wish to address compliance with national and international data privacy and security regulations.

Auction Parameters

At Hypherdata, we have access to in-depth expertise on the constraints and conditions of using medical data when applying AI and machine learning within healthcare systems. Through our broad range of solution partners, our clients get access to alternative solutions that can minimize risks and costs when solving new challenges.

For instance, in the case of data privacy, our customers not only find the correct answer but also gain access to and insight into better solutions, such as:
– Applying synthetic data to eliminate the risk of data privacy violations
– New techniques for more efficient de-identification of data and data augmentation
– Applying innovative data architectures where AI algorithms/models can move to the source of the data instead of having to transfer data or duplicate data sources

Building the right data privacy framework is not only about complying with industry and national regulations; it is also a necessary step to build trust, protect reputation, and develop new revenue streams in healthcare.

Medical recordings

Objective

Data Modality

Surgery recordings

Medical Area

Spine Injuries

Data Diversity requirements

1,000 recordings

Global sample set, excluding US and EU since recordings from these regions are already available

Auction Parameters

For a client building robotic guidance systems for spine surgery, designed to help surgeons with navigation and precision during complex spinal procedures, Hypherdata was appointed to identify global sources of medical recordings. Since the first version of the guidance system had been built using data from the U.S. and EU, Hypherdata sub-licensed data originating from 4 countries to the client, in line with the client’s exclusion criteria.

Hypherdata’s global network, built on willingness and digital preparedness for data and image exchange, helps identify the right sources for specific requirements and enables the simplest and most compliant transfer ecosystem.

Microbiology

Objective

Data modality
EHR and Laboratory reports

Medical area
Microbiology

Data Diversity requirements

Minimum of 4 laboratories spanning 2 continents

Auction Parameters

Working with U.S.-based AI companies building algorithms for early detection of Urinary Tract Infections (UTIs), Hypherdata helped meet their requirement for a diverse dataset intended to be submitted for FDA approval.

Their AI-powered systems can analyze urine samples to potentially identify the presence of bacteria or abnormal cells indicative of UTIs.

Hypherdata connected them with reliable, global laboratories to obtain de-identified datasets that include relevant microbiology data points, such as the type of specimen, organism identification, antimicrobial susceptibility results, and date of culture. By ensuring the right laboratories are introduced, Hypherdata can shorten the back-and-forth clarification process through early alignment on data modalities and expectations, and allow companies like this one to deliver the intended outcome in a shorter, more structured span of time.

The company also received de-identified metadata, such as patient age, gender, and any specific medical history that would contribute to the context of their research.

Oncology

Objective

Data modality
Whole slide images

Medical area
Pathology

Data Diversity requirements

24 diagnoses

200 images per diagnosis

Geographies: Europe, US, Asia

Auction Parameters

We partnered with a U.S.-based company developing AI-powered algorithms for histopathologic assessment of 24 diagnoses to provide objective quantification of disease markers.

In these cases, data is sourced directly from manufacturers of the microscopes, giving data requestors unprecedented access to longitudinal data from various locations.

Oncology with MRI data

Objective

Data Modality
MRI for cancer detection and corresponding EHR / EMR data with DICOM and EHR textual/structured data

Medical Area
Oncology

Data Diversity requirements

1,000 – 10,000 sample size

At least 1 cycle of <confidential> therapy

5 countries located within the European Union excluding the DACH region

Auction Parameters

The request came from a leading organisation within the DACH region of Europe. Its objective was to train an advanced AI algorithm capable of generating synthetic versions of the hepatobiliary phase of liver MRI.

Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
Leveraged our existing network of hospitals across the EU to provide a diverse and extensive dataset of liver MRI scans with comprehensive coverage of the hepatobiliary phase. The dataset encompassed a broad range of patient demographics, liver conditions, and imaging protocols, ensuring robust training of the AI algorithm.
Ensured the available data is annotated, de-identified, and cleaned for a compliant data exchange
Shortened the search by identifying diverse hospitals from our network, addressing more complex inclusion and exclusion criteria questions, and ensuring the first introduction is a useful one so the project can move ahead quickly

The hepatobiliary phase is particularly valuable for assessing liver lesions, such as hepatocellular carcinoma (the most common form of liver cancer) and metastases. A successful outcome could significantly shorten the examination by using artificial intelligence to generate the later phases of the MRI, improving overall efficiency.

Pathology

Objective

Data modality
Whole slide images

Medical area
Pathology

Data Diversity requirements

24 diagnoses

200 images per diagnosis

Geographies: Europe, US, Asia

Auction Parameters

We partnered with a U.S.-based company developing AI-powered algorithms for histopathologic assessment of 24 diagnoses to provide objective quantification of disease markers.

In these cases, data is sourced directly from manufacturers of the microscopes, giving data requestors unprecedented access to longitudinal data from various locations.

Standardization / FAIR

Objective

A pharmaceutical company is setting up a new Machine Learning pipeline for one of its drug discovery projects. As in any typical ML project, the first step is to collect the correct datasets, structure and clean them, and map their features properly.

The data science team has decided to experiment with this project by using two sources of data:

  • In-house datasets
  • Public datasets

The goal is to validate the usefulness and trustworthiness of data from public sources and to confirm that the two data sources can be combined in the pipeline during the first phase of the project, e.g., data aggregation and structuring.
The team needs the fastest and safest way to achieve this while following the standards defined within the organization and complying with the requirements that apply to public datasets.

Auction Parameters

At Hypherdata, we recognize the unique challenges faced by pharmaceutical companies in setting up robust Machine Learning (ML) pipelines for drug discovery projects. We tailored a comprehensive solution to address their specific requirements. Our network of experts worked closely with the organization to identify methods and processes to clean, de-identify and annotate both sources of data.

Cleaning meant ensuring data quality by removing inconsistencies, errors, and redundancies from the healthcare datasets, especially the public one. Annotation involved correctly and efficiently labeling and categorizing medical data to create a high-quality dataset for training their ML model.

Through this support, we ensured the organisation could use expertly curated datasets, with the in-house and public sets seamlessly integrated. This approach helps the organisation streamline research, heighten confidence in its data, and accelerate its drug discovery programmes.