Chvátal-Gomory (CG) cuts and the Bienstock-Zuckerberg hierarchy capture useful linear programs that the standard bounded-degree Lasserre/Sum-of-Squares (SOS) hierarchy fails to capture. In this paper we present a novel polynomial-time SOS hierarchy for 0/1 problems with a custom subspace of high-degree polynomials (not the standard subspace of low-degree polynomials). We show that the new SOS hierarchy recovers the Bienstock-Zuckerberg hierarchy. Our result implies a linear program that reproduces the Bienstock-Zuckerberg hierarchy as a polynomial-sized, efficiently constructible extended formulation that satisfies all constant-pitch inequalities. The construction is also very simple, and it is fully defined by giving the supporting polynomials. Moreover, for a class of polytopes (e.g. set covering and packing problems), the resulting SOS hierarchy optimizes in polynomial time over the polytope resulting from any constant number of rounds of CG cuts, up to an arbitrarily small error. Arguably, this is the first example where different basis functions can be useful in asymmetric situations to obtain a hierarchy of relaxations.
We study a Lagrangian decomposition algorithm recently proposed by Dan Bienstock and Mark Zuckerberg for solving the LP relaxation of a class of open pit mine project scheduling problems. In this study we show that the Bienstock-Zuckerberg (BZ) algorithm can be used to solve LP relaxations corresponding to a much broader class of scheduling problems, including the well-known Resource Constrained Project Scheduling Problem (RCPSP), and multi-modal variants of the RCPSP that consider batch processing of jobs. We present a new, intuitive proof of correctness for the BZ algorithm that works by casting the BZ algorithm as a column generation algorithm. This analysis allows us to draw parallels with the well-known Dantzig-Wolfe (DW) algorithm. We discuss practical computational techniques for speeding up the performance of the BZ and DW algorithms on project scheduling problems. Finally, we present computational experiments independently testing the effectiveness of the BZ and DW algorithms on different sets of publicly available test instances. Our computational experiments confirm that the BZ algorithm significantly outperforms the DW algorithm for the problems considered. Our computat
The molecular biology revolution of the last seventy-five years has transformed our view of living systems. Scientific explanations of biological phenomena are now synonymous with the identification of the genes, proteins, and signaling molecules involved. The hegemony of the molecular paradigm has only become more pronounced as new technologies allow us to make measurements at scale. Combining this wealth of data with new ``artificial intelligence'' techniques is viewed as the future of biology. Here, we challenge this emerging ``common sense'', laying out a roadmap for developing a theoretical understanding of life. We argue that a twenty-first century theoretical biology must be founded on a new type of statistical physics suited to the living world. Rather than merely constructing statistical models, a statistical theory requires developing ``quantitative abstractions'' for understanding the gene-organism-environment triad. This necessitates overcoming four major challenges that distinguish living matter: (1) living systems are composed of a large number of heterogeneous parts rather than a large number of identical objects; (2) living systems control and manipulate the physica
In the present work, we consider Zuckerberg's method for geometric convex-hull proofs introduced in [Geometric proofs for convex hull defining formulations, Operations Research Letters 44(5), 625-629 (2016)]. It has only been scarcely adopted in the literature so far, despite the great flexibility in designing algorithmic proofs for the completeness of polyhedral descriptions that it offers. We suspect that this is partly due to the rather heavy algebraic framework its original statement entails. This is why we present a much more lightweight and accessible approach to Zuckerberg's proof technique, building on ideas from [Extended formulations for convex hulls of some bilinear functions, Discrete Optimization 36, 100569 (2020)]. We introduce the concept of set characterizations to replace the set-theoretic expressions needed in the original version and to facilitate the construction of algorithmic proof schemes. Along with this, we develop several different strategies to conduct Zuckerberg-type convex-hull proofs. Very importantly, we also show that our concept allows for a significant extension of Zuckerberg's proof technique. While the original method was only applicable to 0/1-p
The metaverse has seen growing corporate and popular interest over the past few years. While visions vary, the metaverse is generally seen as an extension of the internet that may be developed through advances in a number of digital technologies, such as augmented and virtual reality, as well as new technical infrastructure and standards. The metaverse constitutes an emerging social imaginary, a way of both understanding and directing our shared existence. This paper examines this emerging social imaginary through the phenomenological concept of dwelling, or being at home in the world, as developed by Martin Heidegger. To examine in depth one influential articulation of this social imaginary, this paper focuses on the metaverse as envisioned by Mark Zuckerberg, CEO of Meta (formerly Facebook). The paper presents a thematic analysis of Zuckerberg's public statements regarding the metaverse to provide a close reading of this particular vision. Then, through the lens of Heidegger's philosophy of dwelling, this paper identifies numerous threats to dwelling posed by the metaverse social imaginary. This paper explains these threats and their prognoses, and it closes with some considerati
We describe the CZ Software Mentions dataset, a new dataset of software mentions in biomedical papers. Plain-text software mentions are extracted with a trained SciBERT model from several sources: the NIH PubMed Central collection and from papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context and metadata, and, for a number of mentions, the disambiguated software entities and links. We extract 1.12 million unique string software mentions from 2.4 million papers in the NIH PMC-OA Commercial subset, 481k unique mentions from the NIH PMC-OA Non-Commercial subset (both gathered in October 2021) and 934k unique mentions from 3 million papers in the Publishers' collection. There is variation in how software is mentioned in papers and extracted by the NER algorithm. We propose a clustering-based disambiguation algorithm to map plain-text software mentions into distinct software entities and apply it on the NIH PubMed Central Commercial collection. Through this methodology, we disambiguate 1.12 million unique strings extracted by the NER model into 97600 unique software entities, covering 78% of all software-paper links. We link 1850
We consider lift-and-project methods for combinatorial optimization problems and focus mostly on those lift-and-project methods which generate polyhedral relaxations of the convex hull of integer solutions. We introduce many new variants of Sherali--Adams and Bienstock--Zuckerberg operators. These new operators fill the spectrum of polyhedral lift-and-project operators in a way which makes all of them more transparent, easier to relate to each other, and easier to analyze. We provide new techniques to analyze the worst-case performances as well as relative strengths of these operators in a unified way. In particular, using the new techniques and a result of Mathieu and Sinclair from 2009, we prove that the polyhedral Bienstock--Zuckerberg operator requires at least $\sqrt{2n}- \frac{3}{2}$ iterations to compute the matching polytope of the $(2n+1)$-clique. We further prove that the operator requires approximately $\frac{n}{2}$ iterations to reach the stable set polytope of the $n$-clique, if we start with the fractional stable set polytope. Lastly, we show that some of the worst-case instances for the positive semidefinite Lovász--Schrijver lift-and-project operator are also bad in
In convex integer programming, various procedures have been developed to strengthen convex relaxations of sets of integer points. On the one hand, there exist several general-purpose methods that strengthen relaxations without specific knowledge of the set $ S $, such as popular linear programming or semi-definite programming hierarchies. On the other hand, various methods have been designed for obtaining strengthened relaxations for very specific sets that arise in combinatorial optimization. We propose a new efficient method that interpolates between these two approaches. Our procedure strengthens any convex set $ Q \subseteq \mathbb{R}^n $ containing a set $ S \subseteq \{0,1\}^n $ by exploiting certain additional information about $ S $. Namely, the required extra information will be in the form of a Boolean formula $ \varphi $ defining the target set $ S $. The aim of this work is to analyze various aspects regarding the strength of our procedure. As one result, interpreting an iterated application of our procedure as a hierarchy, our findings simplify, improve, and extend previous results by Bienstock and Zuckerberg on covering problems.
The Sustainability and Industry Partnership Work Group (SIP-WG) is a part of the National Cancer Institute Informatics Technology for Cancer Research (ITCR) program. The charter of the SIP-WG is to investigate options for the long-term sustainability of open source software (OSS) developed by the ITCR, in part by developing a collection of business model archetypes that can serve as sustainability plans for ITCR OSS development initiatives. The workgroup assembled models from the ITCR program, from other studies, and via engagement of its extensive network of relationships with other organizations (e.g., Chan Zuckerberg Initiative, Open Source Initiative and Software Sustainability Institute). This article reviews existing sustainability models and describes ten OSS use cases disseminated by the SIP-WG and others, and highlights five essential attributes (alignment with unmet scientific needs, dedicated development team, vibrant user community, feasible licensing model, and sustainable financial model) to assist academic software developers in achieving best practice in software sustainability.
We consider operators acting on convex subsets of the unit hypercube. These operators are used in constructing convex relaxations of combinatorial optimization problems presented as a 0,1 integer programming problem or a 0,1 polynomial optimization problem. Our focus is mostly on operators that, when expressed as a lift-and-project operator, involve the use of semidefiniteness constraints in the lifted space, including operators due to Lasserre and variants of the Sherali--Adams and Bienstock--Zuckerberg operators. We study the performance of these semidefinite-optimization-based lift-and-project operators on some elementary polytopes --- hypercubes that are chipped (at least one vertex of the hypercube removed by intersection with a closed halfspace) or cropped (all $2^n$ vertices of the hypercube removed by intersection with $2^n$ closed halfspaces) to varying degrees of severity $\rho$. We prove bounds on $\rho$ where these operators would perform badly on the aforementioned examples. We also show that the integrality gap of the chipped hypercube is invariant under the application of several lift-and-project operators of varying strengths.
We study the rank of the Sum of Squares (SoS) hierarchy over the Boolean hypercube for Symmetric Quadratic Functions (SQFs) in $n$ variables with roots placed in points $k-1$ and $k$. Functions of this type have played a central role in deepening the understanding of the performance of the SoS method for various unconstrained Boolean hypercube optimization problems, including the Max Cut problem. Recently, Lee, Prakash, de Wolf, and Yuen proved a lower bound on the SoS rank for SQFs of $\Omega(\sqrt{k(n-k)})$ and conjectured the lower bound of $\Omega(n)$ by similarity to a polynomial representation of the $n$-bit OR function. Using Chebyshev polynomials, we refute the Lee -- Prakash -- de~Wolf -- Yuen conjecture and prove that the SoS rank for SQFs is at most $O(\sqrt{nk}\log(n))$. We connect this result to two constrained Boolean hypercube optimization problems. First, we provide a degree $O( \sqrt{n})$ SoS certificate that matches the known SoS rank lower bound for an instance of Min Knapsack, a problem that was intensively studied in the literature. Second, we study an instance of the Set Cover problem for which Bienstock and Zuckerberg conjectured an SoS rank lower bound of $n/4$. We re
Currently, there is a limited understanding of how data privacy concerns vary across the world. The Cambridge Analytica scandal triggered a wide-ranging discussion on social media about user data collection and use practices. We conducted an inter-language study of this online conversation to compare how people speaking different languages react to data privacy breaches. We collected tweets about the scandal written in Spanish and English between April and July 2018. We used the Meaning Extraction Method in both datasets to identify their main topics. They reveal a similar emphasis on Zuckerberg's hearing in the US Congress and the scandal's impact on political issues. However, our analysis also shows that while English speakers tend to attribute responsibilities to companies, Spanish speakers are more likely to connect them to people. These findings show the potential of inter-language comparisons of social media data to deepen the understanding of cultural differences in data privacy perspectives.
The metaverse was first introduced in 1992. Many people regard "metaverse" as a new word, but the concept itself is not new. However, Zuckerberg's press release drew widespread attention to the metaverse. This study presents a bibliometric evaluation of metaverse technology, which has been discussed in the literature since the nineties. A field study is carried out specifically for the metaverse, which is a new and trendy subject. In this way, descriptive information is presented on journals, institutions, prominent researchers, and countries in the field, together with an additional evaluation of the prominent topics in the field and of highly cited researchers. In our study, which was carried out by extracting the data of all documents between the years 1990-2021 from the Web of Science database, we found that, historically, there have been few studies in the literature on the metaverse, even though its popularity has peaked in recent months. In addition, the subject is studied mainly in connection with virtual reality and augmented reality technologies, and the education sector and digital marketing fields show interest in the field. Metaverse will probably have ent
We consider the problem of characterizing the convex hull of the graph of a bilinear function $f$ on the $n$-dimensional unit cube $[0,1]^n$. Extended formulations for this convex hull are obtained by taking subsets of the facets of the Boolean Quadric Polytope (BQP). Extending existing results, we propose a systematic study of properties of $f$ that guarantee that certain classes of BQP facets are sufficient for an extended formulation. We use a modification of Zuckerberg's geometric method for proving convex hull characterizations [Geometric proofs for convex hull defining formulations, Operations Research Letters \textbf{44} (2016), 625--629] to prove some initial results in this direction. In particular, we provide small-sized extended formulations for bilinear functions whose corresponding graph is either a cycle with arbitrary edge weights or a clique or an almost clique with unit edge weights.
A valid inequality $\alpha^T x \ge \alpha_0$ for a set covering problem is said to have pitch $\le k$ ($k$ a positive integer) if the $k$ smallest positive $\alpha_j$ sum to at least $\alpha_0$. This paper presents a new, simple derivation of a relaxation for set covering problems whose solutions satisfy all valid inequalities of pitch $\le k$ and which is of polynomial size, for each fixed $k$. We also consider the minimum knapsack problem, and show that for each fixed integer $p > 0$ and $0 < \epsilon < 1$ one can separate, within additive tolerance $\epsilon$, from the relaxation defined by the valid inequalities with coefficients in $\{0, 1, \ldots, p\}$ in time polynomial in the number of variables and $1/\epsilon$.
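The pitch definition in the preceding abstract can be illustrated with a small sketch. The function name `pitch` and the list-of-coefficients input are assumptions made for this illustration and do not come from the paper; the sketch also assumes a positive right-hand side and does not check that the inequality is actually valid for any particular set covering instance.

```python
from typing import Optional, Sequence


def pitch(alpha: Sequence[float], alpha_0: float) -> Optional[int]:
    """Return the smallest k such that the k smallest positive
    coefficients of alpha sum to at least alpha_0.

    Hypothetical helper for illustration only. Assumes alpha_0 > 0.
    Returns None if even the sum of all positive coefficients is
    smaller than alpha_0, in which case no such k exists.
    """
    # Only positive coefficients count, taken in increasing order.
    positive = sorted(a for a in alpha if a > 0)
    running_sum = 0.0
    for k, coefficient in enumerate(positive, start=1):
        running_sum += coefficient
        if running_sum >= alpha_0:
            return k
    return None


# x1 + x2 + x3 >= 1 is an ordinary covering row: pitch 1.
print(pitch([1, 1, 1], 1))
# 2*x1 + 2*x2 + x3 + x4 + x5 >= 3: the three unit coefficients are needed.
print(pitch([2, 2, 1, 1, 1], 3))
```

Under this reading, an inequality has pitch $\le k$ exactly when `pitch(alpha, alpha_0)` returns a value that is at most $k$.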
Several ongoing international efforts are developing methods of localizing single cells within organs or mapping the entire human body at the single cell level, including the Chan Zuckerberg Initiative's Human Cell Atlas (HCA), the Knut and Alice Wallenberg Foundation's Human Protein Atlas (HPA), and the National Institutes of Health's Human BioMolecular Atlas Program (HuBMAP). Their goals are to understand cell specialization, interactions, spatial organization in their natural context, and ultimately the function of every cell within the body. In the same way that the Human Genome Project had to assemble sequence data from different people to construct a complete sequence, multiple centers around the world are collecting tissue specimens from a diverse population that varies in age, race, sex, and body size. A challenge will be combining these heterogeneous tissue samples into a 3D reference map that will enable multiscale, multidimensional Google Maps-like exploration of the human body. Key to making alignment of tissue samples work is identifying and using a coordinate system called a Common Coordinate Framework (CCF), which defines the positions, or "addresses", in a refe
This report describes the participation of two Danish universities, University of Copenhagen and Aalborg University, in the international search engine competition on COVID-19 (the 2020 TREC-COVID Challenge) organised by the U.S. National Institute of Standards and Technology (NIST) and its Text Retrieval Conference (TREC) division. The aim of the competition was to find the best search engine strategy for retrieving precise biomedical scientific information on COVID-19 from the largest, at that point in time, dataset of curated scientific literature on COVID-19 -- the COVID-19 Open Research Dataset (CORD-19). CORD-19 was the result of a call to action to the tech community by the U.S. White House in March 2020, and was shortly thereafter posted on Kaggle as an AI competition by the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft, and the National Library of Medicine at the US National Institutes of Health. CORD-19 contained over 200,000 scholarly articles (of which more than 100,000 were with full text) about COVID-19, SARS-CoV-2, and related coronaviruses, gathered from curated biomedical sourc