A new study says clear guidelines need to be established for the generation and processing of synthetic data to ensure transparency, accountability and fairness.
Synthetic data, generated by machine learning algorithms from original real-world data, is gaining attention because it can provide a privacy-preserving alternative to traditional data sources. This is especially useful when the real data is too sensitive to share, missing, or of poor quality.
Synthetic data differs from real-world data because it is generated by algorithmic models called synthetic data generators, such as generative adversarial networks or Bayesian networks.
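To illustrate the idea in miniature: a minimal "synthetic data generator" can be sketched by fitting a simple statistical model to real records and sampling new ones from it. The dataset and model below are illustrative assumptions, not from the study; real generators such as generative adversarial networks or Bayesian networks learn far richer structure than a single Gaussian.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical "real" dataset: 1,000 records with two numeric attributes
# (say, age and income). In practice this would be sensitive data.
real = rng.multivariate_normal(
    mean=[40.0, 55_000.0],
    cov=[[100.0, 20_000.0], [20_000.0, 2.5e8]],
    size=1_000,
)

# Minimal generator: estimate the mean and covariance of the real data,
# then sample entirely new records from the fitted distribution.
mu = real.mean(axis=0)
sigma = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mean=mu, cov=sigma, size=1_000)

# The synthetic records mimic aggregate statistics of the real data
# without copying any individual row.
print(np.allclose(synthetic.mean(axis=0), mu, rtol=0.1))
```

Even this toy example shows why regulation is tricky: the synthetic rows are artificial, yet they are derived from, and statistically resemble, the original personal data.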
The study warns that existing data protection laws that apply only to personal data are not sufficiently equipped to regulate the processing of synthetic data of all kinds.
Laws such as GDPR only apply to the processing of personal data. The GDPR definition of personal data includes “any information relating to an identified or identifiable natural person.” However, not all synthetic datasets are completely artificial. Some may contain personal information or pose a risk of re-identification. Fully synthetic datasets are generally exempt from GDPR rules, unless there is a possibility of re-identification.
It remains unclear what level of re-identification risk would suffice to trigger the application of data protection laws to fully synthetic data. This creates legal uncertainty and practical difficulties for those processing such datasets.
The study, by Professor Ana Beduschi of the University of Exeter, was published in the journal Big Data & Society.
It says there need to be clear procedures for holding accountable those responsible for generating and processing synthetic data. We need assurances that synthetic data will not be generated and used in ways that harm individuals and society, such as perpetuating existing biases or creating new ones.
Professor Beduschi said: “We need to establish clear guidelines for all types of synthetic data, prioritizing transparency, accountability and fairness. This is especially important as advanced language models such as E3 and GPT-4, which can be trained on or generate synthetic data, can facilitate the spread of misleading information and have a negative impact on society. As such, adhering to these principles can help reduce potential harm and foster responsible innovation.
“Therefore, synthetic data should be clearly labeled as such and information about its generation should be provided to the user.”
More information:
Ana Beduschi, Synthetic Data Protection: Towards a Paradigm Change in Data Regulation?, Big Data & Society (2024). DOI: 10.1177/20539517241231277