Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch
Intermediate | Hyunwoo Kim, Niloofar Mireshghallah et al. | Feb 3 | arXiv
The paper introduces PRIVASIS, a fully synthetic dataset of 1.4 million records containing realistic-looking private details, generated from scratch so that none of it belongs to a real person.
#synthetic dataset #privacy preservation #data sanitization