Read how it worked for others
Use case categories
Comparative Effectiveness Data for a Phase 3 Cardiovascular Drug
Objective
Obtain real-world comparative effectiveness data for a new Phase 3 cardiovascular drug.
The pharmaceutical company aimed to compare the effectiveness of their new drug with existing treatments for a specific cardiovascular condition. They required data on patient outcomes, medication adherence, incidence of cardiovascular events, and healthcare utilization.
Auction Parameters
The company detailed the specific cardiovascular condition, treatment regimens, and desired patient population size. Data providers, including electronic health record (EHR) vendors, insurance companies, and healthcare organizations, submitted bids to provide comparative data, focusing on data quality, depth of clinical information, and cost.
Efficacy and Safety Data for a Phase 3 Oncology Treatment
Objective
Gather real-world data on the efficacy and safety of a new Phase 3 oncology treatment.
The pharmaceutical company conducted a Phase 3 clinical trial for a new cancer drug and required additional real-world data to support regulatory submissions and post-market surveillance. They required data on treatment outcomes, adverse events, patient quality of life, and overall survival rates.
Auction Parameters
The company specified the types of cancer being treated, patient demographics, and geographic regions of interest. Data providers, such as cancer registries, health systems, and research networks, submitted bids to supply the necessary data, emphasizing aspects like data comprehensiveness, patient coverage, and integration with clinical trial results.
Longitudinal Data on Cancer Treatment Progression
Objective
Obtain longitudinal data tracking the progression of cancer in patients undergoing a specific treatment regimen.
The pharmaceutical company aimed to understand long-term treatment outcomes, survival rates, quality of life, and recurrence rates in patients with a particular type of cancer. They needed detailed data on treatment protocols, patient follow-up visits, imaging results, biomarkers, and any subsequent therapies.
Auction Parameters
The auction invited bids from data providers capable of supplying extensive, high-quality longitudinal datasets. The company prioritized bids offering data with longer follow-up periods, more comprehensive patient tracking, and better integration with clinical trial data.
Patient Health Outcomes Data for Diabetes Treatment
Objective
Acquire comprehensive datasets on health outcomes for patients using a new diabetes treatment.
The pharmaceutical company sought real-world evidence on the efficacy and safety of, and patient adherence to, the new diabetes medication. They required data including patient demographics, comorbidities, treatment regimens, glucose levels, and adverse events.
Auction Parameters
The company specified the desired data attributes, patient population size, geographical regions, and the time frame of data collection. Data providers (e.g., healthcare providers, insurance companies, health tech companies) submitted bids on the contract to supply the required data, competing on aspects such as data quality, coverage, and price.
Patient Journey Mapping
Objective
Understand the patient journey for individuals living with a specific chronic disease, from diagnosis through treatment and long-term management.
The Medical Affairs department needed detailed data on patient experiences, including diagnostic pathways, treatment regimens, adherence patterns, and long-term health outcomes. This data helped to identify gaps in care and opportunities for improving patient support programs.
Auction Parameters
The department specified the chronic disease of interest, key data points along the patient journey, and the required patient population size. Data providers, such as patient advocacy groups, health information exchanges, and digital health platforms, bid to provide the most detailed and accurate patient journey data, highlighting their ability to capture comprehensive longitudinal data and patient insights.
Patient-Reported Outcomes for a Phase 3 Rheumatoid Arthritis Treatment
Objective
Collect patient-reported outcomes (PROs) for a new Phase 3 treatment for rheumatoid arthritis.
The pharmaceutical company needed real-world data on patient-reported outcomes to complement their Phase 3 trial results. This included data on pain levels, physical functioning, medication side effects, and overall satisfaction with the treatment.
Auction Parameters
The company outlined the desired patient population characteristics, PRO measures, and data collection methods. Data providers, such as patient advocacy groups, digital health platforms, and clinical research organizations, submitted bids to supply the PRO data, highlighting their ability to capture comprehensive and accurate patient-reported information.
Post-Market Safety Surveillance
Objective
Monitor the long-term safety profile of a newly approved drug in a real-world population.
The Medical Affairs department sought to continuously monitor adverse events and other safety concerns associated with a new drug. They required comprehensive data from diverse sources, including electronic health records (EHRs), patient registries, and pharmacovigilance databases.
Auction Parameters
The department specified the types of adverse event data needed, the patient population size, and the geographical regions. Data providers submitted bids to offer the most comprehensive and timely safety data, focusing on aspects like data accuracy, coverage, and integration with existing pharmacovigilance systems.
Real-World Effectiveness Data for HEOR
Objective
Collect data to support health economic outcomes research and demonstrate the real-world effectiveness of a treatment.
The Medical Affairs department required data on treatment effectiveness, patient-reported outcomes, healthcare utilization, and cost-effectiveness. This data supported value-based discussions with payers and healthcare providers.
Auction Parameters
The department outlined the specific therapeutic area, patient demographics, and desired outcome measures. Data providers, such as health economics research organizations, health systems, and insurance companies, submitted bids to supply the required data, emphasizing the robustness of their data, the comprehensiveness of outcomes measured, and cost-effectiveness of their solution.
AI for Cancer Detection
Objective
Data Modality
MRI for cancer detection and corresponding EHR / EMR data with DICOM and EHR textual/structured data
Medical Area
Oncology
Data Diversity requirements
1,000 – 10,000 sample size
At least 1 cycle of <confidential> therapy
5 countries located within the European Union excluding the DACH region
Auction Parameters
The request came from a leading organisation in the DACH region of Europe that aimed to train an advanced AI algorithm capable of generating synthetic versions of the liver MR hepatobiliary phase.
Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals pan-EU to provide a diverse and extensive dataset of liver MRI scans with comprehensive coverage of the hepatobiliary phase. The dataset encompassed a broad range of patient demographics, liver conditions, and imaging protocols, ensuring robust training of the AI algorithm.
– Ensured the available data was annotated, de-identified, and cleaned for a compliant data exchange
– Shortened the search by identifying diverse hospitals from our network, addressing the more complex inclusion and exclusion criteria questions, and making sure the first introduction was a useful one that allowed significant progress.
The hepatobiliary phase is particularly valuable for assessing liver lesions, such as hepatocellular carcinoma (the most common form of liver cancer) and metastases. A successful outcome could revolutionize liver imaging by significantly shortening the examination process while leveraging artificial intelligence to generate the later phases of the MRI and improving overall efficiency.
Building AI solution
Objective
Data Modality
Electrocardiogram (ECG) for 45 diagnoses
Medical Area
Cardiology
Data Diversity requirements
10,000 – 100,000 sample size
Minimum 3 geographic locations spanning 3 continents
Auction Parameters
At Hypherdata, we addressed a leading company’s request for data to build its AI-driven cardiology solution. Our network of global data providers, both hospitals and clinicians, holds a diverse and extensive dataset of electrocardiograms (ECGs).
With global healthcare data coverage, we can meet geographic diversity requirements as well as several other inclusion and exclusion criteria. This diverse data pool, with its robust sample size, ensured a well-rounded representation of cardiac conditions, accounting for variations in demographics, lifestyles, and healthcare practices worldwide.
Through our network of service providers for annotation and data cleaning, we ensure that the dataset provided is always of exceptional quality. Accurate and well-annotated data minimizes bias, enhances model interpretability, and boosts the overall reliability of their AI solution.
Once a match has been made, companies are invited to our deal room, where they can efficiently navigate the contractual aspects of accessing the data, annotations, and cleaning services. Our pre-established frameworks, templates, and structures facilitate smooth negotiations, allowing them to focus on their core AI development efforts.
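As an illustration only, the data diversity requirements listed for this request (10,000 to 100,000 samples, at least 3 geographic locations, 3 continents) could be checked against a set of provider offers with a few lines of Python. The `SiteManifest` type, its field names, and the `meets_diversity` helper are hypothetical and are not part of any Hypherdata tooling; treating the sample range as a hard upper and lower bound is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SiteManifest:
    """Hypothetical summary of what one data provider can contribute."""
    location: str       # placeholder for a geographic location
    continent: str
    sample_count: int   # number of ECG recordings offered


def meets_diversity(
    sites: Iterable[SiteManifest],
    min_samples: int = 10_000,
    max_samples: int = 100_000,
    min_locations: int = 3,
    min_continents: int = 3,
) -> bool:
    """Return True if the combined offers satisfy the stated requirements."""
    sites = list(sites)
    total = sum(s.sample_count for s in sites)
    locations = {s.location for s in sites}
    continents = {s.continent for s in sites}
    return (
        min_samples <= total <= max_samples
        and len(locations) >= min_locations
        and len(continents) >= min_continents
    )


# Illustrative offers only; these are not real providers or figures.
offers = [
    SiteManifest("Location A", "Europe", 6_000),
    SiteManifest("Location B", "Asia", 5_000),
    SiteManifest("Location C", "South America", 4_000),
]
print(meets_diversity(offers))
```

A check like this only covers the headline numbers; clinical inclusion and exclusion criteria would still need to be reviewed case by case.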
Cardiology ECG
Objective
Data Modality
Electrocardiogram (ECG) for 45 diagnoses
Medical Area
Cardiology
Data Diversity requirements
10,000 – 100,000 sample size
Minimum 3 geographic locations spanning 3 continents
Auction Parameters
At Hypherdata, we addressed a leading company’s request for data to build its AI-driven cardiology solution. Our network of global data providers, both hospitals and clinicians, holds a diverse and extensive dataset of electrocardiograms (ECGs).
With global healthcare data coverage, we can meet geographic diversity requirements as well as several other inclusion and exclusion criteria. This diverse data pool, with its robust sample size, ensured a well-rounded representation of cardiac conditions, accounting for variations in demographics, lifestyles, and healthcare practices worldwide.
Through our network of service providers for annotation and data cleaning, we ensure that the dataset provided is always of exceptional quality. Accurate and well-annotated data minimizes bias, enhances model interpretability, and boosts the overall reliability of their AI solution.
Once a match has been made, companies are invited to our deal room, where they can efficiently navigate the contractual aspects of accessing the data, annotations, and cleaning services. Our pre-established frameworks, templates, and structures facilitate smooth negotiations, allowing them to focus on their core AI development efforts.
Data Aggregation & Integration 1
Objective
Data Modality
Multimodal neuroimaging data for biomarker identification / validation with demographic information
Cleaning and Standardising available data from multiple projects in UK, US and Cuba
Anonymized EHR information for demographic, unaggregated
Medical Area
Neurology
Data Diversity requirements
100,000+
Wide age distribution of the dataset
Globally representative sample
Auction Parameters
To aid research on biomarkers from day 115 to age 100, with the potential to support AI solutions in this area, a company reached out to Hypherdata with a large set of requirements as phase 1 of a multi-phase project.
Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals and partners to provide a diverse and extensive multimodal imaging dataset including, but not limited to, MRI and PET scans, clinical trial information, and EHR / EMR data
– Ensured the available data was de-identified, annotated, and cleaned for a compliant data exchange
– Shortened the search by identifying diverse healthcare providers from our network, especially those with digital capabilities and an intent to partner on research, addressing the more complex inclusion and exclusion criteria questions, and making sure the first introduction was a useful one that allowed significant progress.
Data Aggregation & Integration 2
Objective
Data Modality
Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality
CT scans and / or Pathology images
Cleaned and annotated for quicker usage
Medical Area
Otorhinolaryngology / Lung Cancer
Data Diversity requirements
North America representing multiple healthcare providers
Auction Parameters
Hypherdata works closely with AI solution providers that have delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. Diversity at different cycles of market expansion implies different modalities, which either ensure improvement of the product or enable greater innovation in the route to, or type of, solution.
A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.
With Hypherdata’s network, companies that are delivering AI solutions while simultaneously improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process became defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.
Data Annotation
Objective
Data Modality
Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality
CT scans and / or Pathology images
Cleaned and annotated for quicker usage
Medical Area
Otorhinolaryngology / Lung Cancer
Data Diversity requirements
North America representing multiple healthcare providers
Auction Parameters
Hypherdata works closely with AI solution providers that have delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. Diversity at different cycles of market expansion implies different modalities, which either ensure improvement of the product or enable greater innovation in the route to, or type of, solution.
A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.
With Hypherdata’s network, companies that are delivering AI solutions while simultaneously improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process became defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.
Data Cleaning
Objective
Data Modality
Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality
CT scans and / or Pathology images
Cleaned and annotated for quicker usage
Medical Area
Otorhinolaryngology / Lung Cancer
Data Diversity requirements
North America representing multiple healthcare providers
Auction Parameters
Hypherdata works closely with AI solution providers that have delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. Diversity at different cycles of market expansion implies different modalities, which either ensure improvement of the product or enable greater innovation in the route to, or type of, solution.
A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.
With Hypherdata’s network, companies that are delivering AI solutions while simultaneously improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process became defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.
Data Security & Privacy
Objective
A mid-sized hospital is entering into a collaboration with a technology company specializing in medical recommendation systems for cancer research. Using AI, a new system is being developed to help the oncology department predict and diagnose breast cancer cases better and faster. The existing collection of medical data, collected and maintained by the hospital IT department, will be used to train this new AI system.
The hospital’s management and IT department are looking for a sound strategy and roadmap to identify potential risks surrounding the AI company’s use of the data, and to map out the technical project from contracts through to the continuous transfer of data for AI refinement and the input of AI services for predictive outcomes. Both organisations also wish to address compliance with national and international data privacy and security regulations.
Auction Parameters
At Hypherdata, we have access to in-depth expertise on the constraints and conditions of using medical data when applying AI and Machine Learning within healthcare systems. Fuelled by the broad range of solution partners we have, our clients get access to new alternative solutions that can minimize the risks and costs when solving new challenges.
For instance, in the case of data privacy, our customers not only find the right answer but also gain access to, and insight into, alternative solutions such as:
– Applying Synthetic data to eliminate the risk of data privacy violations
– New techniques to more efficiently apply de-identification of data and data augmentation
– Applying innovative data architectures where AI algorithms/models can move to the source of data instead of having to transfer data or duplicate data sources
Building the right data privacy framework is not only about complying with industry and national regulations but a necessary step to build trust, protect reputation, and develop new revenue streams in healthcare.
Imaging Data
Objective
Data Modality
MRI for cancer detection and corresponding EHR / EMR data with DICOM and EHR textual/structured data
Medical Area
Oncology
Data Diversity requirements
1,000 – 10,000 sample size
At least 1 cycle of <confidential> therapy
5 countries located within the European Union excluding the DACH region
Auction Parameters
The request came from a leading organisation in the DACH region of Europe that aimed to train an advanced AI algorithm capable of generating synthetic versions of the liver MR hepatobiliary phase.
Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals pan-EU to provide a diverse and extensive dataset of liver MRI scans with comprehensive coverage of the hepatobiliary phase. The dataset encompassed a broad range of patient demographics, liver conditions, and imaging protocols, ensuring robust training of the AI algorithm.
– Ensured the available data was annotated, de-identified, and cleaned for a compliant data exchange
– Shortened the search by identifying diverse hospitals from our network, addressing the more complex inclusion and exclusion criteria questions, and making sure the first introduction was a useful one that allowed significant progress.
The hepatobiliary phase is particularly valuable for assessing liver lesions, such as hepatocellular carcinoma (the most common form of liver cancer) and metastases. A successful outcome could revolutionize liver imaging by significantly shortening the examination process while leveraging artificial intelligence to generate the later phases of the MRI and improving overall efficiency.
Improve AI Solution
Objective
Data Modality
Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality
CT scans and / or Pathology images
Cleaned and annotated for quicker usage
Medical Area
Otorhinolaryngology / Lung Cancer
Data Diversity requirements
North America representing multiple healthcare providers
Auction Parameters
Hypherdata works closely with AI solution providers that have delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. Diversity at different cycles of market expansion implies different modalities, which either ensure improvement of the product or enable greater innovation in the route to, or type of, solution.
A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.
With Hypherdata’s network, companies that are delivering AI solutions while simultaneously improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process became defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.
Lung Cancer
Objective
Data Modality
Comprehensive Datasets including participant characteristics, screening exam results, diagnostic procedures, lung cancer, and mortality
CT scans and / or Pathology images
Cleaned and annotated for quicker usage
Medical Area
Otorhinolaryngology / Lung Cancer
Data Diversity requirements
North America representing multiple healthcare providers
Auction Parameters
Hypherdata works closely with AI solution providers that have delivered solutions and / or products to specific markets but continuously work to improve the efficacy of their algorithms through a steady flow of newer and more diverse data. Diversity at different cycles of market expansion implies different modalities, which either ensure improvement of the product or enable greater innovation in the route to, or type of, solution.
A typical agreement with a hospital can take over 2 years to materialize, with an additional 10-12 months typically required by a developer to clean and annotate the data to align with existing standards.
With Hypherdata’s network, companies that are delivering AI solutions while simultaneously improving them benefit from quick introductions to healthcare providers and data aggregators, which accelerates both the search and the outcome. Hypherdata’s network also includes companies offering data cleaning and data annotation services for AI; after a quick round of introductions, the process became defined, purpose-oriented, and fast, saving time and resources. The team’s and platform’s capabilities include contract management and a secure deal room, enabling a more streamlined approach.
Medical recordings
Objective
Data Modality
Surgery recordings
Medical Area
Spine Injuries
Diversity requirements
1,000 recordings
Global sample set, excluding US and EU since recordings from these regions are already available
Auction Parameters
For a client building robotic guidance systems for spine surgery, designed to help surgeons with navigation and precision during complex spinal procedures, Hypherdata was appointed to identify global sources of medical recordings. Since the first version of the guidance system had been built using data from the U.S. and EU, Hypherdata sub-licensed data originating from 4 countries to the client, in keeping with their exclusion criteria.
Hypherdata’s global network, built on willingness and digital preparedness for data and image exchange, helps identify the right sources for specific requirements and enables a simple, compliant transfer ecosystem.
Microbiology
Objective
Data modality
EHR and Laboratory reports
Medical area
Microbiology
Data Diversity requirements
Minimum of 4 laboratories spanning 2 continents
Auction Parameters
Working with U.S.-based AI companies building algorithms for early detection of urinary tract infections (UTIs), Hypherdata supported their requirement for a diverse dataset intended to be submitted for FDA approval.
Their AI-powered systems can analyze urine samples to potentially identify the presence of bacteria or abnormal cells indicative of UTIs.
Hypherdata connected them with reliable, global laboratories to obtain de-identified datasets that include relevant microbiology data points, such as the type of specimen, organism identification, antimicrobial susceptibility results, and date of culture. By ensuring the right laboratories are introduced, Hypherdata can shorten the back-and-forth clarification process through early alignment on data modalities and expectations, and allow companies like this one to deliver the intended outcome in a shorter, more structured span of time.
The company also received de-identified metadata, such as patient age, gender, and any specific medical history that would contribute to the context of their research.
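To make the data points above concrete, the following is a minimal Python sketch of how one de-identified culture record might be represented and checked for completeness before exchange. The `CultureRecord` class, its field names, and the `missing_core_fields` helper are hypothetical illustrations, not a Hypherdata or laboratory schema, and the sketch assumes that all four microbiology data points are required for every record.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class CultureRecord:
    """Hypothetical de-identified microbiology record."""
    specimen_type: str = ""                # type of specimen
    organism: str = ""                     # organism identification
    # Antimicrobial name -> susceptibility result (coding scheme assumed)
    susceptibility: Dict[str, str] = field(default_factory=dict)
    culture_date: Optional[date] = None    # date of culture
    # De-identified metadata mentioned in the text
    patient_age: Optional[int] = None
    patient_gender: Optional[str] = None
    medical_history: List[str] = field(default_factory=list)


def missing_core_fields(record: CultureRecord) -> List[str]:
    """List the core microbiology data points that are absent."""
    missing = []
    if not record.specimen_type:
        missing.append("specimen_type")
    if not record.organism:
        missing.append("organism")
    if not record.susceptibility:
        missing.append("susceptibility")
    if record.culture_date is None:
        missing.append("culture_date")
    return missing


# Illustrative values only; not taken from any real dataset.
example = CultureRecord(
    specimen_type="urine",
    organism="Escherichia coli",
    susceptibility={"nitrofurantoin": "S"},
    culture_date=date(2024, 1, 15),
    patient_age=54,
)
print(missing_core_fields(example))
```

Agreeing on a structure like this early is one way to reduce the back-and-forth clarification the text describes.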
MRI Data
Objective
Data Modality
Electrocardiogram (ECG) for 45 diagnoses
Medical Area
Cardiology
Data Diversity requirements
10,000 – 100,000 sample size
Minimum 3 geographic locations spanning 3 continents
Auction Parameters
At Hypherdata, we addressed a leading company’s request for data to build its AI-driven cardiology solution. Our network of global data providers, both hospitals and clinicians, holds a diverse and extensive dataset of electrocardiograms (ECGs).
With global healthcare data coverage, we can meet geographic diversity requirements as well as several other inclusion and exclusion criteria. This diverse data pool, with its robust sample size, ensured a well-rounded representation of cardiac conditions, accounting for variations in demographics, lifestyles, and healthcare practices worldwide.
Through our network of service providers for annotation and data cleaning, we ensure that the dataset provided is always of exceptional quality. Accurate and well-annotated data minimizes bias, enhances model interpretability, and boosts the overall reliability of their AI solution.
Once a match has been made, companies are invited to our deal room, where they can efficiently navigate the contractual aspects of accessing the data, annotations, and cleaning services. Our pre-established frameworks, templates, and structures facilitate smooth negotiations, allowing them to focus on their core AI development efforts.
Neurology
Objective
Data Modality
Multimodal neuroimaging data for biomarker identification / validation with demographic information
Cleaning and Standardising available data from multiple projects in UK, US and Cuba
Anonymized EHR information for demographic, unaggregated
Medical Area
Neurology
Data Diversity requirements
100,000+
Wide age distribution of the dataset
Globally representative sample
Auction Parameters
To aid research on biomarkers from day 115 to age 100, with the potential to support AI solutions in this area, a company reached out to Hypherdata with a large set of requirements as phase 1 of a multi-phase project.
Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals and partners to provide a diverse and extensive multimodal imaging dataset including, but not limited to, MRI and PET scans, clinical trial information, and EHR / EMR data
– Ensured the available data was de-identified, annotated, and cleaned for a compliant data exchange
– Shortened the search by identifying diverse healthcare providers from our network, especially those with digital capabilities and an intent to partner on research, addressing the more complex inclusion and exclusion criteria questions, and making sure the first introduction was a useful one that allowed significant progress.
Oncology
Objective
Data modality
Whole slide images
Medical area
Pathology
Data Diversity requirements
24 diagnoses
200 images per diagnosis
Geographies: Europe, US, Asia
Auction Parameters
We partnered with a U.S.-based company developing AI-powered algorithms for histopathologic assessment of 24 diagnoses to provide objective quantification of disease markers.
In these cases, data is sourced directly from manufacturers of the microscopes, giving data requestors unprecedented access to longitudinal data from various locations.
Oncology with MRI data
Objective
Data Modality
MRI for cancer detection and corresponding EHR / EMR data with DICOM and EHR textual/structured data
Medical Area
Oncology
Data Diversity requirements
1,000 – 10,000 sample size
At least 1 cycle of <confidential> therapy
5 countries located within the European Union excluding the DACH region
Auction Parameters
The request came from a leading organisation in the DACH region of Europe that aimed to train an advanced AI algorithm capable of generating synthetic versions of the liver MR hepatobiliary phase.
Understanding the precise data requirements and the intended impact, Hypherdata supported the company in a few ways:
– Leveraged our existing network of hospitals pan-EU to provide a diverse and extensive dataset of liver MRI scans with comprehensive coverage of the hepatobiliary phase. The dataset encompassed a broad range of patient demographics, liver conditions, and imaging protocols, ensuring robust training of the AI algorithm.
– Ensured the available data was annotated, de-identified, and cleaned for a compliant data exchange
– Shortened the search by identifying diverse hospitals from our network, addressing the more complex inclusion and exclusion criteria questions, and making sure the first introduction was a useful one that allowed significant progress.
The hepatobiliary phase is particularly valuable for assessing liver lesions, such as hepatocellular carcinoma (the most common form of liver cancer) and metastases. A successful outcome could revolutionize liver imaging by significantly shortening the examination process while leveraging artificial intelligence to generate the later phases of the MRI and improving overall efficiency.
Pathology
Objective
Data modality
Whole slide images
Medical area
Pathology
Data Diversity requirements
24 diagnoses
200 images per diagnosis
Geographies: Europe, US, Asia
Auction Parameters
We partnered with a U.S.-based company developing AI-powered algorithms for histopathologic assessment of 24 diagnoses to provide objective quantification of disease markers.
In these cases, data is sourced directly from manufacturers of the microscopes, giving data requestors unprecedented access to longitudinal data from various locations.
Standardization / FAIR
Objective
A pharmaceutical company is setting up a new Machine Learning pipeline for one of its drug discovery projects. As in any typical ML project, the first step is to collect the right datasets, structure and clean them, and perform proper feature mapping.
The data science team has decided to experiment with this project by using two sources of data:
- In-house datasets
- Public datasets
The goal is to validate the usefulness and trustworthiness of data from public sources, and to confirm that the two data sources can be combined in the pipeline during the first phase of the project, e.g., data aggregation and structuring.
They need the fastest and safest way to achieve this, following the standards defined within the organization while complying with all requirements that come with public datasets.
Auction Parameters
At Hypherdata, we recognize the unique challenges faced by pharmaceutical companies in setting up robust Machine Learning (ML) pipelines for drug discovery projects. We tailored a comprehensive solution to address their specific requirements. Our network of experts worked closely with the organization to identify methods and processes to clean, de-identify and annotate both sources of data.
Cleaning meant ensuring data quality by removing inconsistencies, errors, and redundancies from the healthcare datasets, especially the public one. Annotation involved correctly and efficiently labeling and categorizing medical data to create a high-quality dataset for training their ML model.
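As a rough illustration of the cleaning and de-identification steps described above, the following Python sketch drops erroneous and redundant records, normalizes inconsistent labels, and replaces patient identifiers with a salted hash. The field names (`patient_id`, `age`, `diagnosis`), the plausibility range for age, and the helper names are assumptions made for this example; they do not describe Hypherdata's actual process, and salted hashing is pseudonymization rather than full anonymization.

```python
import hashlib


def pseudonymize(patient_id: str, salt: str) -> str:
    """Replace an identifier with a truncated salted SHA-256 digest (pseudonymization only)."""
    return hashlib.sha256((salt + patient_id).encode("utf-8")).hexdigest()[:16]


def clean_records(records, salt):
    """Return cleaned, pseudonymized copies of the input records (hypothetical schema)."""
    seen = set()
    cleaned = []
    for rec in records:
        age = rec.get("age")
        # Errors: skip records without an ID or with an implausible age.
        if rec.get("patient_id") is None or not isinstance(age, int) or not 0 <= age <= 120:
            continue
        # Inconsistencies: normalize whitespace and case in the diagnosis label.
        diagnosis = str(rec.get("diagnosis", "")).strip().lower()
        # Redundancies: keep only the first occurrence of each record.
        key = (rec["patient_id"], diagnosis, age)
        if key in seen:
            continue
        seen.add(key)
        # Only whitelisted fields are copied, so direct identifiers such as names are dropped.
        cleaned.append({
            "patient_key": pseudonymize(str(rec["patient_id"]), salt),
            "age": age,
            "diagnosis": diagnosis,
        })
    return cleaned
```

Copying only whitelisted fields, instead of deleting known identifiers, means an unexpected column in a public dataset cannot leak through by accident.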
Through this support, we ensured the organisation could use its expertly curated in-house datasets, with the in-house and public sets seamlessly integrated. A professional strategy based on the right knowledge will help the organisation streamline research, heighten data confidence, and accelerate drug discovery programmes.
Auction Listing is Free
Hypherdata is for organizations looking to start a real-world data procurement auction and for organizations ready to offer data at competitive prices.
an account
approve Bids
E-auctions for pharmaceutical companies and start-ups offer a streamlined and cost-efficient method to procure high-quality datasets from healthcare providers. By using a digital auction platform, these companies can invite multiple providers to bid, including their existing vendors, driving down costs through competitive pricing while ensuring transparency and speed in the procurement process. Clear definitions of data requirements, robust evaluation criteria, and secure, compliant handling of bids are essential to maximize the benefits of e-auctions, ultimately enabling data consumers to receive the best possible bid.
To become a data provider on our platform, you can start by visiting our website and signing up for an account. During the quick sign-up process, you will be prompted to provide some basic information about yourself or your organization. Once you’ve completed the sign-up, you’ll gain access to our platform and our partnerships team will reach out to start the vetting process. In the meantime, you are encouraged to submit data offerings via our simple data offering forms to speed up the onboarding process. We welcome data providers from various industries and domains and are always looking to expand our dataset offerings to better serve our users. If you have any specific questions or require assistance during the vetting process, please don’t hesitate to contact our partnerships team directly.
To become a data consumer on our platform and launch an e-auction campaign, you can start by visiting our website and signing up for an account. During the quick sign-up process, you will be prompted to provide some basic information about yourself or your organization. Once you’ve completed the sign-up, you’ll gain access to our platform. As a registered data consumer, you can submit a data request via a simple data request form. Once you submit the data request, our partnerships team will reach out to start the vetting process and initiate the e-auction. If you have any specific questions or require assistance during the vetting process, please don’t hesitate to contact our partnerships team directly.
Since we work with hundreds of data providers who update their collections regularly, our approach to finding the right data partner is different from that of standard data repositories. Our platform features an AI-based e-auction tool to help you discover datasets relevant to your specific requirements in real time. Submit data requests quickly, review and launch an e-auction under the conditions you prefer, and receive bids from multiple providers in a fraction of the time it would take through traditional methods. More on the approach here.
Hypherdata is building a proprietary database of datasets; to date, however, most of the datasets sold come from third-party data providers. This allows us to offer a diverse range of datasets covering various medical domains, data modalities and quantities, which is currently impossible to provide via a single database. Choose from a diverse range of data sources, geographies, and formats, including electronic health records, imaging, claims data, patient registries, and more, tailored to your specific research needs. These collections include structured, semi-structured, and unstructured data to cater to different analytical needs.
We hold data providers accountable for maintaining the quality and accuracy of the datasets they promote on our platform. To ensure this, we implement measures such as allowing for sample verification and spreading data delivery across several milestones or deliverables. Additionally, we facilitate a dispute period during which discrepancies or issues with the data can be raised and resolved. By holding data providers responsible and providing mechanisms for verification and dispute resolution, we strive to maintain high standards of data quality and accuracy for our users.
In the rare event that you encounter data inconsistencies or errors in a purchased dataset, we provide mechanisms to address such issues promptly. Our datasets are delivered in phases, allowing for thorough review and verification. During this period, if you identify any inconsistencies or errors, you can raise them within the dispute period. We take data quality seriously and strive to resolve any discrepancies efficiently. Our support team is available to assist you throughout the process, ensuring a smooth resolution and your satisfaction with the purchased dataset.
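To make the phased delivery and dispute period concrete, here is a minimal Python sketch that checks whether an issue with a delivered milestone can still be raised. The 14-day default window and the function name are purely illustrative assumptions; the actual dispute period is whatever the agreement for a given dataset specifies.

```python
from datetime import date, timedelta


def can_raise_dispute(delivered_on: date, today: date, dispute_days: int = 14) -> bool:
    """Return True if `today` falls within the dispute window of a milestone.

    `dispute_days` is a hypothetical default, not a documented platform value.
    """
    window_end = delivered_on + timedelta(days=dispute_days)
    return delivered_on <= today <= window_end
```

For example, with a milestone delivered on 1 March and the assumed 14-day window, an issue found on 10 March could still be raised, while one found on 20 March could not.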
Our datasets are updated continuously, reflecting the real-time nature of the data we offer. Our providers collect real-world datasets continually, ensuring that consumers have access to the most up-to-date information. In addition to real-time data, we also provide historical datasets, offering insights into past trends and patterns. This approach enables users to subscribe to and access both historical and real-time data, facilitating the creation of real-world evidence for various applications and analyses.
Instead of engaging in time-consuming contract negotiations with multiple users individually, Hypherdata offers standard licensing agreements to conclude projects faster. We ensure transparency and compliance with licensing agreements to protect the rights of both data providers and consumers. However, if customization is required, our platform module allows specific terms and agreements to be uploaded and edited, so that all documents can be managed efficiently from one place.
Yes, it is standard practice to access sample data or preview a dataset before making a purchase. This allows you to evaluate the quality, structure, and relevance of the dataset to ensure that it meets your needs before committing to a purchase.
The usage rights and restrictions for each dataset are specified in the licensing agreements provided by the data providers. While some datasets may have restrictions on commercial use or redistribution, others may allow more flexibility depending on the terms of the license. We encourage data consumers to review the licensing agreements carefully to ensure compliance with usage restrictions.
Data privacy and security are top priorities for us. We adhere to strict data protection regulations and industry best practices to safeguard the confidentiality, integrity, and availability of the datasets on our platform. We implement robust security measures, encryption protocols, access controls, and data anonymization techniques to protect sensitive information and mitigate security risks.
Our platform is accessible globally, and we strive to make our datasets available to users worldwide. However, certain datasets may be subject to geographical restrictions or licensing limitations imposed by data providers. We provide information about any such restrictions or limitations on our platform to help you make informed decisions about accessing or purchasing datasets.
Our datasets are priced dynamically, with factors such as the complexity, size, and value of the data, as well as the licensing terms, influencing the pricing. Additionally, we employ a bidding system where data providers can place bids on datasets they’re interested in collecting. This competitive bidding process helps drive the price down, ensuring that data consumers can access datasets at competitive rates. We strive to offer transparent pricing models that accommodate various budgets and needs, while also promoting fair competition among users.
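The effect of competitive bidding can be sketched as a simple reverse auction in Python: among the bids that satisfy the consumer's evaluation criteria, the lowest price wins. The `Bid` fields, the 0 to 1 `quality_score`, and the tie-breaking rule are assumptions made for this illustration and do not describe how the platform actually evaluates bids.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Bid:
    provider: str
    price: float
    quality_score: float  # hypothetical score between 0 and 1


def select_winning_bid(
    bids: Iterable[Bid],
    min_quality: float = 0.0,
    budget: Optional[float] = None,
) -> Optional[Bid]:
    """Pick the cheapest bid that meets the quality threshold and optional budget."""
    eligible = [
        b for b in bids
        if b.quality_score >= min_quality and (budget is None or b.price <= budget)
    ]
    if not eligible:
        return None
    # Lowest price wins; on equal price, the higher quality score is preferred.
    return min(eligible, key=lambda b: (b.price, -b.quality_score))
```

With bids of 100, 80, and 90 from three providers, the 80 bid wins when no quality threshold is set, but a threshold that excludes it shifts the result to the next cheapest eligible bid.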
Generally, our pricing for data providers is based on a revenue sharing model. The percentages may vary depending on the type of dataset, licensing terms, and other factors. We strive to offer competitive revenue sharing that reflects the value of the datasets and ensures fair compensation for our data providers. The Hypherdata fee is included in the final price of the dataset.
Our pricing model for data consumers is designed to provide flexibility and transparency throughout the data procurement process. Submitting a data request and receiving first feedback from our team is free of charge. If, after this initial feedback, the data consumer decides to launch the e-auction and publish the data request to receive bids from data providers, Hypherdata charges a publishing fee. This fee covers the administrative costs associated with managing the e-auction process and facilitating communication between data consumers and providers. It’s important to note that real-world data is rarely licensed off the shelf, and this fee helps cover the additional complexities involved in negotiating and finalizing data licensing agreements. If you have any questions about our pricing model or require further clarification, please don’t hesitate to contact our support team.
Yes, we provide support and assistance for data integration and analysis to help you maximize the value of the datasets you purchase from our platform. Our team of third-party experts is available to assist you with data integration, API access, data processing, analysis tools, and any other technical or analytical support you may need to leverage the datasets effectively for your projects.
Power your RWD knowledge
Unlocking the Potential of Health Data: The EU’s Data Strategy and Health Innovation
Europe’s data strategy envisions a future where data drives global competitiveness and sovereignty. Through common European data spaces, the EU empowers companies and individuals to retain control over…
The Road to Precision: Addressing Challenges for AI Solutions in Healthcare
In recent years, artificial intelligence (AI) has emerged as a transformative technology in the field of healthcare. With the potential to revolutionize predictive and diagnostic outcomes, to impact…
Hypherdata is founded by a scientist
While working on Huntington’s disease treatment, she faced a lack of data and cooperation. So Jana Miniarikova decided to find a remedy for this problem.
Being solitary costs…
Find the right matches for data challenges
Datasets are collected every day. You can see it trending in any sector today. But, are you aware data you accumulate can be re-used? For example, in life…
We simplify finding the right solution
Ever longed to find something relevant to help you with the work? To be more efficient, receive relevant information or a tool to help you. Unfortunately, browsing the…