*This page contains
links to the latest (and most complete) revision of each work.
*

- Research Papers:

- Mahdi Cheraghchi, Ryan Gabrys, Olgica Milenkovic, and João Ribeiro Coded Trace Reconstruction.
In
*Proceedings of the IEEE Information Theory Workshop (ITW 2019)*, 2019.

Abstract: Motivated by average-case trace reconstruction and coding for portable DNA-based storage systems, we initiate the study ofcoded trace reconstruction, the design and analysis of high-rate efficiently encodable codes that can be efficiently decoded with high probability from few reads (also calledtraces) corrupted by edit errors. [...]

- Mahdi Cheraghchi, Valentine Kabanets, Zhenjian Lu, Dimitrios Myrisiotis.
Circuit Lower Bounds for MCSP from Local Pseudorandom Generators.
In
*Proceedings of the 46th International Colloquium on Automata, Languages and Programming (ICALP 2019)*, 2019.

- Fuchun Lin, Reihaneh Safavi-Naini, Mahdi Cheraghchi, Huaxiong Wang.
Non-Malleable Codes against Active Physical Layer Adversary.
In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2019)*, 2019.

- Mahdi Cheraghchi, João Ribeiro Simple Codes and Sparse Recovery with Fast Decoding.
In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2019)*, 2019.

Abstract: Construction of error-correcting codes achieving a designated minimum distance parameter is a central problem in coding theory. A classical and algebraic family of error-correcting codes studied for this purpose are the BCH codes. In this work, we study a very simple construction of linear codes that achieve a given distance parameter K.

- Fuchun Lin, Mahdi Cheraghchi, Venkatesan Guruswami, Reihaneh Safavi-Naini, Huaxiong Wang. Secret Sharing with Binary Shares.
In Proceedings of the 10th Innovations in Theoretical Computer Science Conference (ITCS), 2019.

Abstract: [...] In this work, we study secret sharing in the extremal case of bit-long shares, where even ramp secret sharing becomes impossible. We show, however, that a slightly relaxed but equally effective notion of semantic security for the secret, and negligible reconstruction error probability, eliminates the impossibility. Moreover, we provide explicit constructions of such schemes. Our relaxation results in separation of adaptive and non-adaptive adversaries. [...].

- Mahdi Cheraghchi, João Ribeiro . Sharp Analytical Capacity Upper Bounds for Sticky and Related Channels.
In Proceedings of the 56th Annual Allerton Conference on Communication, Control, and Computing, 2018.

- Journal version: "Sharp Analytical Capacity Upper Bounds for Sticky and Related Channels
". IEEE Transcations on Information Theory, 2019.
DOI:10.1109/TIT.2019.2920375.

Abstract: We study natural examples of binary channels with synchronization errors. These include the duplication channel, which independently outputs a given bit once or twice, and geometric channels that repeat a given bit according to a geometric rule, with or without the possibility of bit deletion. [...].

- Mahdi Cheraghchi. Capacity Upper Bounds for Deletion-type Channels.
*In Proceedings of the 50th ACM Symposium on Theory of Computing (STOC 2018), 2018.*

- Journal version: "Capacity Upper Bounds for Deletion-type Channels
". Journal of the ACM (JACM), Volume 66 Issue 2, Article 9, 2019. DOI:
10.1145/3281275.

Abstract: We develop a systematic approach, based on convex programming and real analysis, for obtaining upper bounds on the capacity of the binary deletion channel and, more generally, channels with i.i.d. insertions and deletions. Other than the classical deletion channel, we give a special attention to the Poisson-repeat channel introduced by Mitzenmacher and Drinea (IEEE Transactions on Information Theory, 2006).Our framework can be applied to obtain capacity upper bounds for any repetition distribution (the deletion and Poisson-repeat channels corresponding to the special cases of Bernoulli and Poisson distributions). Our techniques essentially reduce the task of proving capacity upper bounds to maximizing a univariate, real-valued, and often concave function over a bounded interval. The corresponding univariate function is carefully designed according to the underlying distribution of repetitions and the choices vary depending on the desired strength of the upper bounds as well as the desired simplicity of the function (e.g., being only efficiently computable versus having an explicit closed-form expression in terms of elementary, or common special, functions). Among our results, we show the following:

1. The capacity of the binary deletion channel with deletion probability $d$ is at most $(1-d) \log \varphi$ for $d \geq 1/2$, and, assuming the capacity function is convex, is at most $1-d \log(4/\varphi)$ for $d<1/2$, where $\varphi=(1+\sqrt{5})/2$ is the golden ratio. This is the first nontrivial capacity upper bound for any value of $d$ outside the limiting case $d \to 0$ that is fully explicit and proved without computer assistance.

2. We derive the first set of capacity upper bounds for the Poisson-repeat channel. Our results uncover further striking connections between this channel and the deletion channel, and suggest, somewhat counter-intuitively, that the Poisson-repeat channel is actually analytically simpler than the deletion channel and may be of key importance to a complete understanding of the deletion channel.

3. We derive several novel upper bounds on the capacity of the deletion channel. All upper bounds are maximums of efficiently computable, and concave, univariate real functions over a bounded domain. In turn, we upper bound these functions in terms of explicit elementary and standard special functions, whose maximums can be found even more efficiently (and sometimes, analytically, for example for $d=1/2$).

Along the way, we develop several new techniques of potentially independent interest.

- Mahdi Cheraghchi, João Ribeiro Improved Capacity Upper Bounds for the Discrete-Time Poisson Channel.
In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2018)*, 2018.

- Journal version: "Improved Upper Bounds and Structural Results on the Capacity of the Discrete-Time Poisson Channel
". IEEE Transcations on Information Theory, 2019. DOI:
10.1109/TIT.2019.2896931.

Abstract: We present new capacity upper bounds for the discrete-time Poisson channel with no dark current and an average-power constraint. These bounds are a simple consequence of techniques developed by one of the authors for the seemingly unrelated problem of upper bounding the capacity of binary deletion and repetition channels. Previously, the best known capacity upper bound in the regime where the average-power constraint does not approach zero was due to Martinez (JOSA B, 2007), which we re-derive as a special case of our framework. Furthermore, we instantiate our framework to obtain a closed-form bound that noticeably improves the result of Martinez everywhere.

- Mahdi Cheraghchi. Expressions for the entropy of binomial-type distributions.
In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2018)*, 2018.

- Journal version: "Expressions for the Entropy of Basic Discrete Distributions
". IEEE Transcations on Information Theory, 2019. DOI:
10.1109/TIT.2019.2900716.

Abstract: We develop a general method for computing logarithmic and log-gamma expectations of distributions. As a result, we derive series expansions and integral representations of the entropy for several fundamental distributions, including the Poisson, binomial, beta-binomial, negative binomial, and hypergeometric distributions. Our results also establish connections between the entropy functions and to the Riemann zeta function and its generalizations.

- Thach V. Bui,
Minoru Kuribayashi, Mahdi Cheraghchi, Isao Echizen. Efficiently Decodable Non-Adaptive Threshold Group Testing.
In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2018)*, 2018.

- Journal version: "Efficiently Decodable Non-Adaptive Threshold Group Testing
". IEEE Transcations on Information Theory, 2019. DOI:
10.1109/TIT.2019.2907990.

Abstract: We consider non-adaptive threshold group testing for identification of up to d defective items in a set of N items, where a test is positive if it contains at least 2 ≤ u ≤ d defective items, and negative otherwise. [...]

- K. Chandrasekaran, M. Cheraghchi,
V. Gandikota, E. Grigorescu. Local Testing for
Membership in Lattices.
*In Proceedings of the 36th Foundations of Software Technology and Theoretical Computer Science conference**(FSTTCS 2016)**,*2016.

- Journal version:
Local Testing of Lattices.
SIAM Journal on Discrete Mathematics 32(2), pp. 1265-1295, 2018.

Abstract: Motivated by the structural analogies between point lattices and linear error-correcting codes, and by the mature theory on locally testable codes, we initiate a systematic study of local testing for membership in lattices.

Testing membership in lattices is also motivated in practice, by applications to integer programming, error detection in lattice-based communication, and cryptography. Apart from establishing the conceptual foundations of lattice testing, our results include the following: 1. We demonstrate upper and lower bounds on the query complexity of local testing for the well-known family of code formula lattices. Furthermore, we instantiate our results with code formula lattices constructed from Reed-Muller codes, and obtain nearly-tight bounds. 2. We show that in order to achieve low query complexity, it is sufficient to design one-sided non-adaptive canonical tests. This result is akin to, and based on an analogous result for error-correcting codes due to Ben-Sasson et al. (SIAM J. Computing 35(1) pp1-21).

- Mahdi Cheraghchi, Elena
Grigorescu, Brendan Juba, Karl Wimmer, Ning Xie. AC0-MOD2 lower bounds for the
Boolean Inner Product. In
*Proceedings of the 43rd International Colloquium on Automata, Languages and Programming (ICALP 2106)*, 2016.

- Journal version:
AC0-MOD2 lower bounds for the
Boolean Inner Product.
Journal of Computer and System Sciences
vol. 97, pp. 45-59, 2018.

Abstract: AC0-MOD2 circuits are AC0 circuits augmented with a layer of parity gates just above the input layer. We study the AC0-MOD2 circuit lower bound for computing the Boolean Inner Product functions. Recent works by Servedio and Viola (ECCC TR12-144) and Akavia et al. (ITCS 2014) have highlighted this problem as a frontier problem in circuit complexity that arose both as a first step towards solving natural special cases of the matrix rigidity problem and as a candidate for constructing pseudorandom generators of minimal complexity.

We give the first superlinear lower bound for the Boolean Inner Product function against AC0-MOD2 of depth four or greater. Indeed, we prove a superlinear lower bound for circuits of arbitrary constant depth, and an Ω~(n^{2}) lower bound for the special case of depth-4 AC0-MOD2. Our proof of the depth-4 lower bound employs a new "moment-matching" inequality for bounded, nonnegative integer-valued random variables that may be of independent interest: we prove an optimal bound on the maximum difference between two discrete distributions' values at 0, given that their first d moments match.

- Mahdi Cheraghchi. Nearly Optimal Robust Secret
Sharing. In
*Proceedings of the IEEE International Symposium on Information Theory (ISIT 2016)*, 2016.

- Journal version:
Nearly Optimal Robust Secret
Sharing.
Designs, Codes and Cryptography,
DOI:10.1007/s10623-018-0578-y (open access), 2018.

Abstract: We prove that a known approach to improve Shamir's celebrated secret sharing scheme; i.e., adding an information-theoretic authentication tag to the secret, can make it robust for $n$ parties against any collusion of size $\delta n$, for any constant $\delta \in (0, 1/2)$.

This result holds in the so-called "non-rushing" model in which the $n$ shares are submitted simultaneously for reconstruction. We thus obtain an efficient and robust secret sharing scheme in this model that is essentially optimal in all parameters including the share size which is $k(1+o(1)) + O(\kappa)$, where $k$ is the secret length and $\kappa$ is the security parameter. Like Shamir's scheme, in this modified scheme any set of more than $\delta n$ honest parties can efficiently recover the secret. Using algebraic geometry codes instead of Reed-Solomon codes, we decrease the share length to a constant (only depending on $\delta$) while the number of shares $n$ can grow independently. In this case, when $n$ is large enough, the scheme satisfies the "threshold" requirement in an approximate sense; i.e., any set of $\delta n(1+\rho)$ honest parties, for arbitrarily small $\rho > 0$, can efficiently reconstruct the secret.

- Mahdi Cheraghchi, Piotr
Indyk. Nearly Optimal Deterministic
Algorithm for Sparse Walsh-Hadamard Transform.
In
*Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA 2016)*. arXiv:1504.07648, 2016. - Journal version: "Nearly
Optimal Deterministic Algorithm for Sparse
Walsh-Hadamard Transform". ACM Transactions on
Algorithms (TALG), Volume 13 Issue 3, 2017. DOI:
10.1145/3029050.

Abstract: For every fixed constant α > 0, we design an algorithm for computing thek-sparse Walsh-Hadamard transform of anN-dimensional vectorx∈ R^{N}in timek^{1+α}(logN)^{O(1)}. Specifically, the algorithm is given query access toxand computes ak-sparsex' ∈ R^{N}satisfying || x' - DHT(x) ||_{1}< c || DHT(x) -H(DHT(_{k}x)) ||_{1}for an absolute constant c > 0, where DHT(x) is the transform ofxandH(DHT(_{k}x)) is its bestk-sparse approximation. Our algorithm is fully deterministic and only uses non-adaptive queries tox(i.e., all queries are determined and performed in parallel when the algorithm starts).

An important technical tool that we use is a construction of nearly optimal and linear lossless condensers which is a careful instantiation of the GUV condenser (Guruswami, Umans, Vadhan, JACM 2009). Moreover, we design a deterministic and non-adaptive L1/L1 compressed sensing scheme based on general lossless condensers that is equipped with a fast reconstruction algorithm running in timek^{1+α}(logN)^{O(1)}(for the GUV-based condenser) and is of independent interest. Our scheme significantly simplifies and improves an earlier expander-based construction due to Berinde, Gilbert, Indyk, Karloff, Strauss (Allerton 2008).

Our methods use linear lossless condensers in a black box fashion; therefore, any future improvement on explicit constructions of such condensers would immediately translate to improved parameters in our framework (potentially leading tok(logN)^{O(1)}reconstruction time with a reduced exponent in the poly-logarithmic factor).

- Mahdi Cheraghchi,
Venkatesan Guruswami. Non-Malleable Coding Against
Bit-wise and Split-State Tampering. In
*Proceedings of Theory of Cryptography Conference (TCC 2014).*ECCC TR13-121, 2014. - Journal version: "Non-Malleable
Coding Against Bit-wise and Split-State Tampering".
Journal of Cryptology, 2015. DOI:
10.1007/s00145-015-9219-z.

Abstract: Non-malleable coding, introduced by Dziembowski, Pietrzak and Wichs (ICS 2010), aims for protecting the integrity of information against tampering attacks in situations where error-detection is impossible. Intuitively, information encoded by a non-malleable code either decodes to the original message or, in presence of any tampering, to an unrelated message. Non-malleable coding is possible against any class of adversaries of bounded size. In particular, Dziembowski et al. show that such codes exist and may achieve positive rates for any class of tampering functions of size at most exp(2^{α n}), for any constant α < 1. However, this result is existential and has thus attracted a great deal of subsequent research on explicit constructions of non-malleable codes against natural classes of adversaries.

In this work, we consider constructions of coding schemes against two well-studied classes of tampering functions; namely, bit-wise tampering functions (where the adversary tampers each bit of the encoding independently) and the much more general class of split-state adversaries (where two independent adversaries arbitrarily tamper each half of the encoded sequence). We obtain the following results for these models.

1. For bit-tampering adversaries, we obtain explicit and efficiently encodable and decodable non-malleable codes of length $n$ achieving rate 1-o(1) and error (also known as "exact security") exp(-Ω~(n^{1/7})). Alternatively, it is possible to improve the error to exp(-Ω~(n))$ at the cost of making the construction Monte Carlo with success probability 1-exp(-Ω(n)) (while still allowing a compact description of the code). Previously, the best known construction of bit-tampering coding schemes was due to Dziembowski et al. (ICS 2010), which is a Monte Carlo construction achieving rate close to .1887.

2. We initiate the study of seedless non-malleable extractors as a natural variation of the notion of non-malleable extractors introduced by Dodis and Wichs (STOC 2009). We show that construction of non-malleable codes for the split-state model reduces to construction of non-malleable two-source extractors. We prove a general result on existence of seedless non-malleable extractors, which implies that codes obtained from our reduction can achieve rates arbitrarily close to 1/5 and exponentially small error. In a separate recent work, the authors show that the optimal rate in this model is 1/2. Currently, the best known explicit construction of split-state coding schemes is due to Aggarwal, Dodis and Lovett (ECCC TR13-081) which only achieves vanishing (polynomially small) rate.

- Mahdi Cheraghchi,
Venkatesan Guruswami.
Capacity of Non-Malleable Codes. In
*Proceedings of Innovations in Theoretical Computer Science (ITCS 2014)*. ECCC TR13-118, 2014.

- Journal version: "Capacity
of Non-Malleable Codes.". IEEE Transactions on
Information Theory 62(3), pp 1097-1118, 2016.

Abstract: Non-malleable codes, introduced by Dziembowski, Pietrzak and Wichs (ICS 2010), encode messagessin a manner so that tampering the codeword causes the decoder to either outputsor a message that is independent ofs. While this is an impossible goal to achieve against unrestricted tampering functions, rather surprisingly non-malleable coding becomes possible against every fixed familyFof tampering functions that is not too large (for instance, when |F| < exp(2^{α n}) for someα< 1 wherenis the number of bits in a codeword. In this work, we study the "capacity of non-malleable coding," and establish optimal bounds on the achievable rate as a function of the family size, answering an open problem from Dziembowski et al. (ICS 2010). Specifically,

As a corollary, this implies that the capacity of non-malleable coding in the split-state model (where the tampering function acts independently but arbitrarily on the two halves of the codeword, a model which has received some attention recently) equals 1/2.

- We prove that for every family
Fwith |F| < exp(2^{α n}), there exist non-malleable codes againstFwith rate arbitrarily close to 1-α(this is achieved w.h.p. by

a randomized construction).- We show the existence of families of size exp(poly(n) 2
^{α n}) against which there is no non-malleable code of rate 1-α(in fact this is the case w.h.p for a random family of this size).- We also show that 1-
αis the best achievable rate for the family of functions which are only allowed to tamper the firstα nbits of the codeword, which is of special interest.

We also give an efficient Monte Carlo construction of codes of rate close to 1 with polynomial time encoding and decoding that is non-malleable against any fixedc> 0 and familyFof size exp(n^{c}), in particular tampering functions with say cubic size circuits.

- Mahdi Cheraghchi, Anna Gál, Andrew Mills. Correctness and Corruption of Locally Decodable Codes. Manuscript. ECCC TR12-172, 2012.

Abstract: Locally decodable codes (LDCs) are error correcting codes with the extra property that it is sufficient to read just a small number of positions of a possibly corrupted codeword in order to recover any one position of the input. To achieve this, it is necessary to use randomness in the decoding procedures. We refer to the probability of returning the correct answer as the correctness of the decoding algorithm.

Thus far, the study of LDCs has focused on the question of the tradeoff between their length and the query complexity of the decoders. Another natural question is what is the largest possible correctness, as a function of the amount of codeword corruption and the number of queries, regardless of the length of the codewords. Goldreich et al. (Computational Complexity 15(3), 2006) observed that for a given number of queries and fraction of errors, the correctness probability cannot be arbitrarily close to 1. However, the quantitative dependence between the largest possible correctness and the amount of corruption has not been established before.

We present several bounds on the largest possible correctness for LDCs, as a function of the amount of corruption tolerated and the number of queries used, regardless of the length of the code. Our bounds are close to tight. We also investigate the relationship between the amount of corruption tolerated by an LDC and its minimum distance as an error correcting code.

Even though intuitively the two notions are expected to be related, we demonstrate that in general this is not the case. However, we show a close relationship between minimum distance and

amount of corruption tolerated for linear codes over arbitrary finite fields, and for binary nonlinear codes. We use these results to strengthen the known bounds on the largest possible amount of corruption that can be tolerated by LDCs (with any nontrivial correctness better than random guessing) regardless of the query complexity or the length of the code.

- Mahdi Cheraghchi,
Venkatesan Guruswami, Ameya Velingker. Restricted
Isometry of Fourier Matrices and List Decodability
of Random Linear Codes. In Proceedings of the
ACM-SIAM Symposium on Discrete Algorithms (SODA
2013). arXiv:1207.1140, 2013.

- Journal version: "Restricted Isometry of Fourier Matrices and List Decodability of Random Linear Codes." SIAM Journal on Computing 42(5), pp 1888–1914, 2013.

Abstract: We prove that a random linear code over F_{q}, with probability arbitrarily close to 1, is list decodable at radius 1-1/q-ε with list size L=O(1/ε^{2}) and rateR= Ω_{q}(ε^{2}/(log^{3}(1/ε))). Up to the polylogarithmic factor in 1/ε and constant factors depending onq, this matches the lower boundL=Ω_{q}(1/ε^{2}) for the list size and upper boundR=O_{q}(ε^{2}) for the rate. Previously only existence (and not abundance) of such codes was known for the special caseq=2 (Guruswami, Håstad, Sudan and Zuckerman, 2002).

In order to obtain our result, we employ a relaxed version of the well known Johnson bound on list decoding that translates the average Hamming distance between codewords to list decoding guarantees. We furthermore prove that the desired average-distance guarantees hold for a code provided that a natural complex matrix encoding the codewords satisfies the Restricted Isometry Property with respect to the Euclidean norm (RIP-2). For the case of random binary linear codes, this matrix coincides with a random submatrix of the Hadamard-Walsh transform matrix that is well studied in the compressed sensing literature.

Finally, we improve the analysis of Rudelson and Vershynin (2008) on the number of random frequency samples required for exact reconstruction ofk-sparse signals of lengthN. Specifically, we improve the number of samples fromO(klog(N) log^{2}(k) (logk+ log logN)) toO(klog(N) log^{3}k). The proof involves bounding the expected supremum of a related Gaussian process by using an improved analysis of the metric defined by the process. This improvement is crucial for our application in list decoding.

- Mahdi Cheraghchi, Adam R. Klivans, Pravesh Kothari, Homin K. Lee. Submodular Functions are Noise Stable. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2012. arXiv:1106.0518.

Abstract: We show that all non-negative submodular functions have high noise-stability. As a consequence, we obtain a polynomial-time learning algorithm for this class with respect to any product distribution on {-1,1}^{n}(for any constant accuracy parameter ε). Our algorithm also succeeds in the agnostic setting. Previous work on learning submodular functions required either query access or strong assumptions about the types of submodular functions to be learned (and did not hold in the agnostic setting). Additionally we give simple algorithms that efficiently release differentially private answers to all Boolean conjunctions and to all halfspaces with constant average error, subsuming and improving recent work due to Gupta, Hardt, Roth and Ullman (STOC 2011).

- Mahdi Cheraghchi. Coding-Theoretic Methods for Sparse Recovery. In Proceedings of 49th Annual Allerton Conference on Communication, Control, and Computing, 2011 (invited paper).

Abstract:We review connections between coding-theoretic objects and sparse learning problems. In particular, we show how seemingly different combinatorial objects such as error-correcting codes, combinatorial designs, spherical codes, compressed sensing matrices and group testing designs can be obtained from one another. The reductions enable one to translate upper and lower bounds on the parameters attainable by one object to another. We survey some of the well-known reductions in a unified presentation, and bring some existing gaps to attention. New reductions are also introduced; in particular, we bring up the notion of minimum "L-wise distance" of codes and show that this notion closely captures the combinatorial structure of RIP-2 matrices. Moreover, we show how this weaker variation of the minimum distance is related to combinatorial list-decoding properties of codes.

- Mahdi Cheraghchi. Derandomization and Group Testing. In Proceedings of 48th Annual Allerton Conference on Communication, Control, and Computing, 2010 (invited paper).

Abstract:The rapid development of derandomization theory, which is a fundamental area in theoretical computer science, has recently led to many surprising applications outside its initial intention. We will review some recent such developments related to combinatorial group testing. In its most basic setting, the aim of group testing is to identify a set of "positive" individuals in a population of items by taking groups of items and asking whether there is a positive in each group.

In particular, we will discuss explicit constructions of optimal or nearly-optimal group testing schemes using "randomness-conducting" functions. Among such developments are constructions of error-correcting group testing schemes using randomness extractors and condensers, as well as threshold group testing schemes from lossless condensers.

- Mahdi Cheraghchi, Johan Håstad, Marcus Isaksson, Ola Svensson. Approximating Linear Threshold Predicates. In Proceedings of the 13th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), 2010.
- Journal version: "Approximating Linear Threshold Predicates". ACM Transactions on Computation Theory 4(1), Article 2, March 2012.

Abstract:We study constraint satisfaction problems with constraints defined by homogeneous linear threshold predicates. Specifically, we consider the standard optimization problem Max-CSP(P) where the objective is to satisfy as many constraints of the same typeRon a number of Boolean variables as possible. We assume that the constraints are defined by a fixed linear threshold predicateP:{-1,+1}^{n}→ {-1,+1} of the formP(x_{1}, ...,x_{n}) = sgn(w_{1}x_{1}+ ... +w), for positive integer weights_{n}x_{n}w_{i}(i=1, ..., n). Our focus is on a range of threshold predicates for which the problem does not become approximation resistant, and we study the approximation curve of this class of problems. For the special case of the majority function withw_{1}= ... =w= 1 and_{n}nodd, we obtain almost-matching approximability and hardness results that can be summarized as follows:

We extend the above results to a more general class of "majority-like" predicates and obtain parallel results for them. Loosely speaking, this class of predicates can be defined using weights <

Approximation:Using linear programming, we design a polynomial-time algorithm that, given a 1-δ/(n+1) satisfiable instance for any δ < 1, satisfies a 1/2 + (1-δ)^{7}Ω(1/sqrt(n)) fraction of the constraints.Hardness:Assuming the Unique Games Conjecture (Khot, STOC 2002), for any ε>0 and δ in [0,1], given a (1-δ/(n+1) - ε)-satisfiable instance it is NP-hard to satisfy a 1/2 + (1-δ) Ω(1/sqrt(n) + ε fraction of the constraints.w_{i}that are generally small. Towards this, we introduce the notion ofChow-robustnessthat might be of independent interest.

- Mahdi Cheraghchi. Improved Constructions for Non-adaptive Threshold Group Testing. In Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP), 2010. (video of a talk about this given at IMA, Minnesota MN, February 2012)
- Journal version:
"Improved Constructions for Non-adaptive Threshold
Group Testing". Algorithmica, special issue on "Algorithms and Data
Structures for selection, identification and
encoding ---Group
testing, compressed sensing, multi-access
communication and more", 2013. DOI:
10.1007/s00453-013-9754-7.

Abstract:The basic goal in combinatorial group testing is to identify a set of up to d defective items within a large population of size n >> d using a pooling strategy. Namely, the items can be grouped together in pools, and a single measurement would reveal whether there are one or more defectives in the pool. The threshold model is a generalization of this idea where a measurement returns positive if the number of defectives in the pool passes a fixed thresholdu, negative if this number is below a fixed lower thresholdL≤u, and may behave arbitrarily otherwise. We study non-adaptive threshold group testing (in a possibly noisy setting) and show that, for this problem, O(d^{g+2}(logd) log(n/d)) measurements (whereg:=u-L) suffice to identify the defectives, and also present almost matching lower bounds. This significantly improves the previously known (non-constructive) upper bound O(d^{u+1}log(n/d)). Moreover, we obtain a framework for explicit construction of measurement schemes using lossless condensers. The number of measurements resulting from this scheme is ideally bounded by O(d^{g+3}(logd) logn). Using state-of-the-art constructions of lossless condensers, however, we come up with explicit testing schemes with O(d^{g+3}(logd) quasipoly(logn)) and O(d^{g+3+β}poly(log n)) measurements, for arbitrary constant β > 0.

- Mahdi Cheraghchi, Amin Karbasi, Soheil Mohajer, Vankatesh Saligrama. Graph-Constrained Group Testing. In Proceedings of IEEE International Symposium on Information Theory (ISIT), 2010.
- Journal version: "Graph-Constrained
Group Testing". IEEE Transactions on Information
Theory 58(1), pp 248-262, 2012.

Abstract:Non-adaptive group testing involves grouping arbitrary subsets of n items into different pools. Each pool is then tested and defective items are identified. A fundamental question involves minimizing the number of pools required to identify at mostddefective items. Motivated by applications in network tomography, sensor networks and infection propagation we formulate group testing problems on graphs. Unlike conventional group testing problems each group here must conform to the constraints imposed by a graph. For instance, items can be associated with vertices and each pool is any set of nodes that must be path connected. In this paper we associate a test with a random walk. In this context conventional group testing corresponds to the special case of a complete graph onnvertices.

For interesting classes of graphs we arrive at a rather surprising result, namely, that the number of tests required to identifyddefective items is substantially similar to that required in conventional group testing problems, where no such constraints on pooling is imposed. Specifically, ifT(n) corresponds to the mixing time of the graphG, we show that withm=O(d^{2}T^{2}(n) log(n/d)) non-adaptive tests, one can identify the defective items. Consequently, for the Erdos-Renyi random graphG(n,p), as well as expander graphs with constant spectral gap, it follows thatm=O(d^{2}log^{3}(n)) non-adaptive tests are sufficient to identifyddefective items. We next consider a specific scenario that arises in network tomography and show thatm=O(d^{3}log^{3}(n)) non-adaptive tests are sufficient to identifyddefective items. We also consider noisy counterparts of the graph constrained group testing problem and develop parallel results for these cases.

- Mahdi Cheraghchi, Ali Hormati, Amin Karbasi, Martin Vetterli. Compressed Sensing with Probabilistic Measurements: A Group Testing Solution. In Proceedings of 47th Annual Allerton Conference on Communication, Control, and Computing, 2009.
- Journal version: "Compressed Sensing with Probabilistic Tests: Theory, Design and Application". IEEE Transactions on Information Theory 57(10), pp 7057-7067, 2011.

Abstract:Detection of defective members of large populations has been widely studied in the statistics community under the name "group testing", a problem which dates back to World War II when it was suggested for syphilis screening. There the main interest is to identify a small number of infected people among a large population using collective samples. In viral epidemics, one way to acquire collective samples is by sending agents inside the population. While in classical group testing, it is assumed that the sampling procedure is fully known to the reconstruction algorithm, in this work we assume that the decoder possesses only partial knowledge about the sampling process. This assumption is justified by observing the fact that in a viral sickness, there is a chance that an agent remains healthy despite having contact with an infected person. Therefore, the reconstruction method has to cope with two different types of uncertainty; namely, identification of the infected population and the partially unknown sampling procedure.

In this work, by using a natural probabilistic model for "viral infections", we design non-adaptive sampling procedures that allow successful identification of the infected population with overwhelming probability 1-o(1). We propose both probabilistic and explicit design procedures that require a "small" number of agents to single out the infected individuals. More precisely, for a contamination probability p, the number of agents required by the probabilistic and explicit designs for identification of up tokinfected members is bounded bym= O(k^{2}(logn) /p^{3}) andm= O(k^{2}(log^{2}n) /p^{3}), respectively. In both cases, a simple decoder is able to successfully identify the infected population in time O(mn).

- Mahdi Cheraghchi. Noise-Resilient Group Testing: Limitations and Constructions. In Proceedings of 17th International Symposium on Fundamentals of Computation Theory (FCT), 2009.
- Journal version: "Noise-Resilient Group Testing: Limitations and Constructions", Discrete Applied Mathematics 161(1-2), pp 81-95, 2013. DOI: 10.1016/j.dam.2012.07.022, arXiv:0811.2609.

Abstract:We study combinatorial group testing schemes for learning d-sparse boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we give a general framework for construction of highly noise-resilient group testing schemes using randomness condensers. Simple randomized instantiations of this construction give non-adaptive measurement schemes, withm=O(dlogn) measurements, that allow efficient reconstruction ofd-sparse vectors up to O(d) false positives even in the presence of delta*mfalse positives and Omega(m/d) false negatives within the measurement outcomes, foranyconstant delta < 1. None of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit (and incomparable) constructions, in particular one matching the randomized trade-off but usingm= O(d^{1+o(1)}logn) measurements. We also obtain explicit constructions that allow fast reconstruction in time poly(m), which would be sublinear innfor sufficiently sparse vectors.

- Mahdi Cheraghchi. Capacity Achieving Codes from Randomness Conductors. In Proceedings of IEEE International Symposium on Information Theory (ISIT), 2009.

Abstract:We give a general framework for construction of small ensembles of capacity achieving linear codes for a wide range of (not necessarily memoryless) discrete symmetric channels, and in particular, the binary erasure and symmetric channels. The main tool used in our constructions is the notion of randomness extractors and lossless condensers that are regarded as central tools in theoretical computer science. Same as random codes, the resulting ensembles preserve their capacity achieving properties under any change of basis. Our methods can potentially lead to polynomial-sized ensembles; however, using known explicit constructions of randomness conductors we obtain specific ensembles whose size is as small as quasipolynomial in the block length. By applying our construction to Justesen's concatenation scheme (Justesen, 1972) we obtain explicit capacity achieving codes for BEC (resp., BSC) with almost linear time encoding and almost linear time (resp., quadratic time) decoding and exponentially small error probability. The explicit code for BEC is defined and capacity achieving for every block length.

- Mahdi Cheraghchi, Feredric Didier, Amin Shokrollahi. Invertible Extractors and Wiretap Protocols. In Proceedings of IEEE International Symposium on Information Theory (ISIT), 2009.
- Journal version: "Invertible Extractors and Wiretap Protocols". IEEE Transactions on Information Theory 58(2), pp 1254-1274, 2012.

Abstract:A wiretap protocol is a pair of randomized encoding and decoding functions such that knowledge of a bounded fraction of the encoding of a message reveals essentially no information about the message, while knowledge of the entire encoding reveals the message using the decoder. In this paper we study the notion of efficiently invertible extractors and show that a wiretap protocol can be constructed from such an extractor. We will then construct invertible extractors for symbol-fixing, affine, and general sources and apply them to create wiretap protocols with asymptotically optimal trade-offs between their rate (ratio of the length of the message versus its encoding) and resilience (ratio of the observed positions of the encoding and the length of the encoding). We will then apply our results to create wiretap protocols for challenging communication problems, such as active intruders who change portions of the encoding, network coding, and intruders observing arbitrary boolean functions of the encoding.

- Ehsan Ardestanizadeh, Mahdi Cheraghchi, Amin Shokrollahi. Bit Precision Analysis for Compressed Sensing. In Proceedings of IEEE International Symposium on Information Theory (ISIT), 2009.

Abstract:This paper studies the stability of some reconstruction algorithms for compressed sensing in terms of thebit precision. Considering the fact that practical digital systems deal with discretized signals, we motivate the importance of the total number of accurate bits needed from the measurement outcomes in addition to the number of measurements. It is shown that if one uses a 2k*nVandermonde matrix with roots on the unit circle as the measurement matrix, O(L+klog(n/k)) bits of precision per measurement are sufficient to reconstruct ak-sparse signalin R ^{n }withdynamic range(i.e., the absolute ratio between the largest and the smallest nonzero coefficients) at most 2^{L}withinLbits of precision, hence identifying its correct support. Finally, we obtain an upper bound on the total number of required bits when the measurement matrix satisfies a restricted isometry property, which is in particular the case for random Fourier and Gaussian matrices. For very sparse signals, the upper bound on the number of required bits for Vandermonde matrices is shown to be better than this general upper bound.

- Mahdi Cheraghchi, Amin Shokrollahi. Almost-Uniform Sampling of Points on High-Dimensional Algebraic Varieties. In Proceedings of 26th International Symposium on Theoretical Aspects of Computer Science (STACS), 2009.

Abstract:We consider the problem of uniform sampling of points on an algebraic variety. Specifically, we develop a randomized algorithm that, given a small set of multivariate polynomials over a sufficiently large finite field, produces a common zero of the polynomials almost uniformly at random. The statistical distance between the output distribution of the algorithm and the uniform distribution on the set of common zeros is polynomially small in the field size, and the running time of the algorithm is polynomial in the description of the polynomials and their degrees provided that the number of the polynomials is a constant.

- Mahdi Cheraghchi, Amin
Shokrollahi, Avi Wigderson.
Computational Hardness and Explicit Constructions of
Error Correcting Codes. In Proceedings of 44th
Allerton Conference on Communication, Control and
Computing (invited paper), 2006.

Abstract:We outline a procedure for using pseudorandom generators to construct binary codes with good properties, assuming the existence of sufficiently hard functions. Specifically, we give a polynomial time algorithm, which for every integers n and k, constructs polynomially many linear codes of block length n and dimension k, most of which achieving the Gilbert-Varshamov bound. The success of the procedure relies on the assumption that the exponential time class of E := DTIME[2^{O(n)}] is not contained in the sub-exponential space class DSPACE[2^{o(n)}]. The methods used in this paper are by now standard within computational complexity theory, and the main contribution of this note is observing that they are relevant to the construction of optimal codes. We attempt to make this note self contained, and describe the relevant results and proofs from the theory of pseudorandomness in some detail.

- Mahdi Cheraghchi. On Matrix Rigidity and the Complexity of Linear Forms. ECCC Technical Report TR05-070. (talk)

Abstract:The rigidity function of a matrix is defined as the minimum number of its entries that need to be changed in order to reduce the rank of the matrix to below a given parameter. Proving a strong enough lower bound on the rigidity of a matrix implies a nontrivial lower bound on the complexity of any linear circuit computing the set of linear forms associated with it. However, although it is shown that most matrices are rigid enough, no explicit construction of a rigid family of matrices is known. In this survey report we review the concept of rigidity and some of its interesting variations as well as several notable results related to that. We also show the existence of highly rigid matrices constructed by evaluation of bivariate polynomials over finite fields.

- Theses:

- Applications of
Derandomization Theory in Coding. Ph.D.
Thesis, EPFL,
Switzerland. [link
to the official copy at the EPFL library]

Abstract:Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. Towards this goal, numerous fundamental notions have been developed to provide a unified framework for approaching various derandomization problems and to improve our general understanding of the power of randomness in computation. Two important classes of such tools are pseudorandom generators and randomness extractors. Pseudorandom generators transform a short, purely random, sequence into a much longer sequence that looks random, while extractors transform a weak source of randomness into a perfectly random one (or one with much better qualities, in which case the transformation is called a randomness condenser).

In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions. We utilize randomness extractors to construct efficient and information-theoretically optimal communication protocols for this model.

Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold.

Next, we use randomness condensers and extractors to design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. Finally, we consider the problem of explicit construction of error-correcting codes on the Gilbert-Varshamov bound and extend the original idea of Nisan and Wigderson to obtain a small ensemble of codes, mostly achieving the bound, under suitable computational hardness assumptions.

- Locally Testble
Codes. M.Sc. Thesis, EPFL, Switzerland. (talk)

Abstract:Error correcting codes are combinatorial objects that allow reliable recovery of information in presence of errors by cleverly augmenting the original information with a certain amount of redundancy.

The availability of efficient means of error detection is considered as a fundamental criterion for error correcting codes. Locally testable codes are families of error correcting codes that are highly robust against transmission errors and in addition provide super-efficient (sublinear time) probabilistic algorithms for error detection. In particular, the error detection algorithm probes the received sequence only at a small (or even constant) number of locations.

There seems to be a trade-off between the amount of redundancy and the number of probes for the error detection procedure in locally testable codes. Even though currently best constructions allow reduction of redundancy to a nearly linear amount, it is not clear whether this can be further reduced to linear while preserving a constant number of probes.

We study the formal notion of locally testable codes and survey several major results in this area. We also investigate closely related concepts, and in particular, polynomial low-degree tests and probabilistically checkable proofs.

*Last Updated on June 2019
*