Pseudo-Deterministic Streaming

Goldwasser, Shafi; Grossman, Ofer; Mohanty, Sidhanth; Woodruff, David P.

doi:10.4230/LIPIcs.ITCS.2020.79

File

LIPIcs.ITCS.2020.79.pdf

Filesize: 0.62 MB
25 pages

Document Identifiers

DOI: 10.4230/LIPIcs.ITCS.2020.79
URN: urn:nbn:de:0030-drops-117644

Author Details

Shafi Goldwasser

Simons Institute for the Theory of Computing, Berkeley, CA, USA
MIT, Cambridge, MA, USA

Ofer Grossman

MIT, Cambridge, MA, USA

Sidhanth Mohanty

University of California Berkeley, CA, USA

David P. Woodruff

Carnegie-Mellon University, Pittsburgh, PA, USA

Cite As Get BibTex

Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, and David P. Woodruff. Pseudo-Deterministic Streaming. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 79:1-79:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ITCS.2020.79

Abstract

A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, ?_2 approximation, finding a nonzero entry in a vector (for turnstile algorithms) are not pseudo-deterministic. For example, in the instance of finding a nonzero entry in a vector, for any known low-space algorithm A, there exists a stream x so that running A twice on x (using different randomness) would with high probability result in two different entries as the output.
In this work, we study whether it is inherent that these algorithms output different values on different executions. That is, we ask whether these problems have low-memory pseudo-deterministic algorithms. For instance, we show that there is no low-memory pseudo-deterministic algorithm for finding a nonzero entry in a vector (given in a turnstile fashion), and also that there is no low-dimensional pseudo-deterministic sketching algorithm for ?_2 norm estimation. We also exhibit problems which do have low memory pseudo-deterministic algorithms but no low memory deterministic algorithm, such as outputting a nonzero row of a matrix, or outputting a basis for the row-span of a matrix.
We also investigate multi-pseudo-deterministic algorithms: algorithms which with high probability output one of a few options. We show the first lower bounds for such algorithms. This implies that there are streaming problems such that every low space algorithm for the problem must have inputs where there are many valid outputs, all with a significant probability of being outputted.

Subject Classification

ACM Subject Classification

Theory of computation → Streaming, sublinear and near linear time algorithms

Keywords

streaming
pseudo-deterministic

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Yuqing Ai, Wei Hu, Yi Li, and David P. Woodruff. New Characterizations in Turnstile Streams with Applications. In 31st Conference on Computational Complexity, CCC 2016, May 29 to June 1, 2016, Tokyo, Japan, pages 20:1-20:22, 2016. URL: https://doi.org/10.4230/LIPIcs.CCC.2016.20.
Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and system sciences, 58(1):137-147, 1999.
Alexandr Andoni, Robert Krauthgamer, and Krzysztof Onak. Streaming algorithms from precision sampling. arXiv preprint, 2010. URL: http://arxiv.org/abs/1011.1263.
Moses Charikar, Kevin Chen, and Martin Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1):3-15, 2004.
Kenneth L Clarkson and David P Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 205-214. ACM, 2009.
Peter Dixon, A Pavan, and NV Vinodchandran. On Pseudodeterministic Approximation Algorithms. In 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
Philippe Flajolet. Approximate counting: a detailed analysis. BIT Numerical Mathematics, 25(1):113-134, 1985.
Gereon Frahling, Piotr Indyk, and Christian Sohler. Sampling in dynamic data streams and applications. International Journal of Computational Geometry & Applications, 18(01n02):3-28, 2008.
Eran Gat and Shafi Goldwasser. Probabilistic Search Algorithms with Unique Answers and Their Cryptographic Applications. In Electronic Colloquium on Computational Complexity (ECCC), volume 18, page 136, 2011.
Oded Goldreich. Multi-pseudodeterministic algorithms. In Electronic Colloquium on Computational Complexity (ECCC), 2019.
Oded Goldreich, Shafi Goldwasser, and Dana Ron. On the possibilities and limitations of pseudodeterministic algorithms. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science, pages 127-138. ACM, 2013.
Shafi Goldwasser and Ofer Grossman. Perfect Bipartite Matching in Pseudo-Deterministic RNC. In Electronic Colloquium on Computational Complexity (ECCC), volume 22, page 208, 2015.
Shafi Goldwasser, Ofer Grossman, and Dhiraj Holden. Pseudo-deterministic Proofs. arXiv preprint, 2017. URL: http://arxiv.org/abs/1706.04641.
Parikshit Gopalan and Jaikumar Radhakrishnan. Finding duplicates in a data stream. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms, pages 402-411. Society for Industrial and Applied Mathematics, 2009.
Ofer Grossman. Finding Primitive Roots Pseudo-Deterministically. In Electronic Colloquium on Computational Complexity (ECCC), volume 22, page 207, 2015.
Ofer Grossman and Yang P Liu. Reproducibility and Pseudo-Determinism in Log-Space. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 606-620. SIAM, 2019.
Moritz Hardt and David P Woodruff. How robust are linear sketches to adaptive inputs? In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 121-130. ACM, 2013.
Dhiraj Holden. A Note on Unconditional Subexponential-time Pseudo-deterministic Algorithms for BPP Search Problems. arXiv preprint, 2017. URL: http://arxiv.org/abs/1707.05808.
Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM (JACM), 53(3):307-323, 2006.
Piotr Indyk and David Woodruff. Optimal approximations of the frequency moments of data streams. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 202-208. ACM, 2005.
Rajesh Jayaram and David P Woodruff. Perfect lp sampling in a data stream. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 544-555. IEEE, 2018.
Hossein Jowhari, Mert Sağlam, and Gábor Tardos. Tight bounds for lp samplers, finding duplicates in streams, and related problems. In Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 49-58. ACM, 2011.
Michael Kapralov, Jelani Nelson, Jakub Pachocki, Zhengyu Wang, David P Woodruff, and Mobin Yahyazadeh. Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams. In Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pages 475-486. Ieee, 2017.
Yi Li, Huy L Nguyen, and David P Woodruff. Turnstile streaming algorithms might as well be linear sketches. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 174-183. ACM, 2014.
Jayadev Misra and David Gries. Finding repeated elements. Science of computer programming, 2(2):143-152, 1982.
Morteza Monemizadeh and David P Woodruff. 1-pass relative-error L p-sampling with applications. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 1143-1160. Society for Industrial and Applied Mathematics, 2010.
Robert Morris. Counting large numbers of events in small registers. Communications of the ACM, 21(10):840-842, 1978.
Jelani Nelson, Huy L Nguyen, and David P Woodruff. On deterministic sketching and streaming for sparse recovery and norm estimation. Linear Algebra and its Applications, 441:152-167, 2014.
Noam Nisan. Pseudorandom generators for space-bounded computation. Combinatorica, 12(4):449-461, 1992.
Igor C Oliveira and Rahul Santhanam. Pseudodeterministic Constructions in Subexponential Time. arXiv preprint, 2016. URL: http://arxiv.org/abs/1612.01817.
Igor C Oliveira and Rahul Santhanam. Pseudo-Derandomizing Learning and Approximation. In LIPIcs-Leibniz International Proceedings in Informatics, volume 116. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
Salil P Vadhan et al. Pseudorandomness. Foundations and Trendsregistered in Theoretical Computer Science, 7(1-3):1-336, 2012.

Pseudo-Deterministic Streaming

Authors Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, David P. Woodruff

File

Document Identifiers

Author Details

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Pseudo-Deterministic Streaming

Authors Shafi Goldwasser, Ofer Grossman, Sidhanth Mohanty, David P. Woodruff

File

Document Identifiers

Author Details

Funding

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message