We present the Mellum model family, open-weight code completion models designed for interactive use in JetBrains IDEs. Mellums have 4B parameters, adopt a Llama-style architecture, and are pre-trained on ~4T tokens of permissively licensed, multi-language code. Our studies show that (i) careful data curation and staged training significantly improve the model's quality, (ii) editor-critical capabilities such as context packing are necessary for high-quality suggestions, and (iii) a compact, task-focused model can meet the cost and latency constraints of interactive completion. In the paper, we describe an end-to-end industrial pipeline for producing contextualized in-editor completion: disciplined data governance, multi-stage training that includes fill-in-the-middle and project context via supervised fine-tuning, and alignment via direct preference optimization using feedback from real-world scenarios. Our quality evaluations include both large-scale offline benchmarks and online telemetry from production deployments in JetBrains IDEs. Mellums are released under the Apache-2.0 license on HuggingFace, with a public model card providing a reproducible reference for practitioners.
Packing peanuts, as defined by Wikipedia, is a common loose-fill packaging and cushioning material that helps prevent damage to fragile items. In this paper, I propose that synthetic data, akin to packing peanuts, can serve as a valuable asset for economic prediction models, enhancing their performance and robustness when integrated with real data. This hybrid approach proves particularly beneficial in scenarios where data is either missing or limited in availability. Through the utilization of Affinity credit card spending and Womply small business datasets, this study demonstrates the substantial performance improvements achieved by employing a hybrid data approach, surpassing the capabilities of traditional economic modeling techniques.
We show that it is perfectly possible to play the 'restricted' two-player, two-strategy quantum games originally proposed by Marinatto and Weber with a pack of 10 cards as the only equipment. The 'quantum board' of such a model of these quantum games is an extreme simplification of the 'macroscopic quantum machines' proposed by one of the authors in numerous papers, which allow one to simulate by macroscopic means various experiments performed on two entangled quantum objects.
In this early draft, we provide an overview of the similarities and differences in the implementation of a paper card-based vaccine credential system and an app-based vaccine credential system. A vaccine credential's primary goal is to regulate entry and ensure the safety of individuals within densely packed public locations and workspaces. This is critical for containing the rapid spread of Covid-19 in densely packed public locations, since a single individual can infect a large majority of people in a crowd. A vaccine credential can also provide information such as an individual's Covid-19 vaccination history and adverse symptom reaction history to judge their potential impact on the overall health of individuals within densely packed public locations and workspaces. After completing the comparisons, we believe a card-based implementation will benefit regions with less socioeconomic mobility, limited resources, and stagnant administrations. An app-based implementation, on the other hand, will benefit regions with equitable internet access and a lower technological divide. We also believe an interoperable system of both credential systems will work best for regions with enormous working-class
The Swiss Light Source (SLS) has on the order of 500 magnet power supplies (PS) installed, ranging from a 3 A/20 V four-quadrant PS to a 950 A/1000 V two-quadrant 3 Hz PS. All magnet PS have a local digital controller for a digital regulation loop and a 5 MHz optical point-to-point link to the VME level. The PS controller runs a pulse width/pulse repetition regulation scheme, optionally with multiple slave regulation loops. Many internal regulation parameters and controller diagnostics are readable by the control system. Industry Pack modules with standard VME carrier cards are used as the VME hardware interface, with a high control density of eight links per VME card. The low-level EPICS interface is identical for all 500 magnet PS, including insertion devices. The digital PS have proven to be very stable and reliable during commissioning of the light source. All specifications were met for all PS. The advanced diagnostics for the magnet PS turned out to be very useful not only for diagnosing the PS but also for identifying problems on the magnets.
In this thesis we introduce a new type of card shuffle called the one-sided transposition shuffle. At each step a card is chosen uniformly from the pack and then transposed with another card chosen uniformly from below it. This defines a random walk on the symmetric group generated by a distribution which is non-constant on the conjugacy class of transpositions. Nevertheless, we provide an explicit formula for all eigenvalues of the shuffle by demonstrating a useful correspondence between eigenvalues and standard Young tableaux. This allows us to prove the existence of a total-variation cutoff for the one-sided transposition shuffle at time $n\log n$. We also study weighted generalisations of the one-sided transposition shuffle called biased one-sided transposition shuffles. We compute the full spectrum for every biased one-sided transposition shuffle, and prove the existence of a total variation cutoff for certain choices of weighted distribution. In particular, we recover the eigenvalues and well known mixing time of the classical random transposition shuffle. We study the hyperoctahedral group as an extension of the symmetric group, and formulate the one-sided transposition shuf
Ulam's metric is the minimal number of moves needed to go between two given permutations, where a move consists of removing one element from a permutation and reinserting it in a different place. The elements that are not moved form a longest common subsequence of the permutations. Aldous and Diaconis, in their paper, pointed out that Ulam's metric had been introduced in the context of questions concerning sorting and tossing cards. In this paper we define and study Ulam's metric in higher dimensions: for dimension one the considered object is a pair of permutations, for dimension k it is a pair of k-tuples of permutations. Over encodings by k-tuples of permutations we define two dually related hierarchies. Our very first motivation comes from the paper of Murata et al., in which pairs of permutations were used as a representation of the topological relation between rectangles packed into a minimal area, with application to VLSI physical design. Our results concern hardness, approximability, and parametrized complexity inside the hierarchies.
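For the one-dimensional case, the relation between unmoved elements and the longest common subsequence gives a direct way to compute the distance: it equals the permutation length minus the length of a longest common subsequence. The following is a minimal sketch under that reading; the function name `ulam_distance` and the relabelling trick (reducing the longest common subsequence to a longest increasing subsequence) are illustrative choices, not taken from the paper.

```python
from bisect import bisect_left


def ulam_distance(p, q):
    """Number of remove-and-reinsert moves needed to turn permutation p into q.

    Assumes p and q are sequences containing the same distinct elements.
    """
    if len(p) != len(q) or set(p) != set(q) or len(set(p)) != len(p):
        raise ValueError("p and q must be permutations of the same distinct elements")

    # Relabel each element by its position in q. After relabelling, q is the
    # identity, so a longest common subsequence of p and q corresponds to a
    # longest increasing subsequence of the relabelled p.
    position_in_q = {value: index for index, value in enumerate(q)}
    relabelled = [position_in_q[value] for value in p]

    # Patience-sorting style computation of the longest increasing subsequence:
    # tails[k] is the smallest possible last value of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for x in relabelled:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x

    # Elements outside a longest common subsequence are the ones that move.
    return len(p) - len(tails)
```

For example, `ulam_distance([2, 3, 4, 1], [1, 2, 3, 4])` is 1, since moving the element 1 to the front suffices.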
We introduce a new type of card shuffle called one-sided transpositions. At each step a card is chosen uniformly from the pack and then transposed with another card chosen uniformly from below it. This defines a random walk on the symmetric group generated by a distribution which is non-constant on the conjugacy class of transpositions. Nevertheless, we provide an explicit formula for all eigenvalues of the shuffle by demonstrating a useful correspondence between eigenvalues and standard Young tableaux. This allows us to prove the existence of a total-variation cutoff for the one-sided transposition shuffle at time $n\log n$. We also study a weighted generalisation of the shuffle which, in particular, allows us to recover the well known mixing time of the classical random transposition shuffle.
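A single step of the shuffle described above can be simulated in a few lines. This is a sketch under stated assumptions: index 0 is taken to be the bottom of the pack, and the second card is drawn uniformly from the positions at or below the first one, so a step may leave the pack unchanged. Whether the first card itself is an allowed choice is not spelled out in the abstract, and the names `one_sided_step` and `one_sided_shuffle` are hypothetical.

```python
import random


def one_sided_step(deck, rng):
    """Apply one one-sided transposition to deck in place.

    Assumption: index 0 is the bottom of the pack, and the lower position is
    drawn uniformly from positions 0..upper inclusive.
    """
    upper = rng.randrange(len(deck))   # card chosen uniformly from the pack
    lower = rng.randrange(upper + 1)   # card chosen uniformly from below it
    deck[upper], deck[lower] = deck[lower], deck[upper]


def one_sided_shuffle(n, steps, seed=None):
    """Return the pack 0..n-1 after the given number of steps."""
    rng = random.Random(seed)
    deck = list(range(n))
    for _ in range(steps):
        one_sided_step(deck, rng)
    return deck
```

To experiment with the cutoff time scale mentioned above, one could call `one_sided_shuffle(n, steps)` with `steps` close to `n * math.log(n)` and compare empirical distributions for slightly smaller and slightly larger step counts.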
We propose a model of card shuffling where a pack of cards, spread as points on a square table, is repeatedly gathered locally at random spots and then spread towards a random direction. A shuffling of the cards is then obtained by arranging the cards by their increasing $x$-coordinate values. When there are $m$ cards on the table we show that this random ordering gets mixed in time $O\left(\log m\right)$. Explicit constants are evaluated in a diffusion limit when the position of $m$ cards evolves as an interesting $2m$-dimensional non-reversible reflected jump diffusion in time. Our main technique involves the use of multidimensional Skorokhod maps for double reflections in $[0,1]^2$ in taking the discrete to continuous limit. The limiting computations are then based on the planar Brownian motion and properties of Bessel processes.
We study the cutoff phenomenon for generalized riffle shuffles where, at each step, the deck of cards is cut into a random number of packs of multinomial sizes which are then riffled together.
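One step of such a shuffle can be sketched as follows. The abstract does not specify how the number of packs is drawn or how the riffle is performed, so this sketch makes several assumptions: the number of packs is passed in by the caller, pack sizes are multinomial with equal cell probabilities, and the riffle drops the next card from a pack with probability proportional to that pack's remaining size. The function name `generalized_riffle` is hypothetical.

```python
import random


def generalized_riffle(deck, num_packs, rng):
    """Return a new list: deck cut into num_packs packs and riffled together.

    Assumptions (not from the source): equal multinomial cell probabilities,
    and a riffle that picks the next pack with probability proportional to
    the number of cards it still holds.
    """
    n = len(deck)

    # Multinomial pack sizes: each card independently lands in a uniform cell.
    sizes = [0] * num_packs
    for _ in range(n):
        sizes[rng.randrange(num_packs)] += 1

    # Cut the deck into consecutive packs of those sizes.
    packs = []
    start = 0
    for size in sizes:
        packs.append(deck[start:start + size])
        start += size

    # Riffle: cards within a pack keep their relative order.
    next_card = [0] * num_packs   # index of the next card to drop per pack
    remaining = list(sizes)       # cards left in each pack
    result = []
    while len(result) < n:
        i = rng.choices(range(num_packs), weights=remaining)[0]
        result.append(packs[i][next_card[i]])
        next_card[i] += 1
        remaining[i] -= 1
    return result
```

A caller who wants a random number of packs can draw `num_packs` from any distribution on the positive integers before each call.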
The game of SET is a popular card game in which the objective is to form Sets using cards from a special deck. In this paper we study single- and multi-round variations of this game from the computational complexity point of view and establish interesting connections with other classical computational problems. Specifically, we first show that a natural generalization of the problem of finding a single Set, parameterized by the size of the sought Set is W-hard; our reduction applies also to a natural parameterization of Perfect Multi-Dimensional Matching, a result which may be of independent interest. Second, we observe that a version of the game where one seeks to find the largest possible number of disjoint Sets from a given set of cards is a special case of 3-Set Packing; we establish that this restriction remains NP-complete. Similarly, the version where one seeks to find the smallest number of disjoint Sets that overlap all possible Sets is shown to be NP-complete, through a close connection to the Independent Edge Dominating Set problem. Finally, we study a 2-player version of the game, for which we show a close connection to Arc Kayles, as well as fixed-parameter tractabilit
Frequently, randomly organized data is needed to avoid anomalous operation of other algorithms and computational processes. An analogy is that a deck of cards is ordered within the pack, but before a game of poker or solitaire the deck is shuffled to create a random permutation. Shuffling is used to ensure that the data elements of a sequence S are randomly arranged, avoiding an ordered or partially ordered permutation. Shuffling is the process of arranging data elements into a random permutation. For a sequence S that aggregates N data elements, there are N! possible permutations. Among this large number of possible permutations, two correspond to a sorted placement of the data elements: the ascending and the descending sorted permutations. Shuffling must avoid inadvertently creating either an ascending or descending permutation. Shuffling is frequently coupled to another algorithmic function -- pseudo-random number generation. The efficiency and quality of the shuffle are directly dependent upon the random number generation algorithm utilized. A more effective and efficient method of shuffling is to use parameterization to configure th
The Bulgarian solitaire is a mathematical card game played by one person. A pack of n cards is divided into several decks (or "piles"). Each move consists of removing one card from each deck and collecting the removed cards to form a new deck. The game ends when the same position occurs twice. It turns out that when n=k(k+1)/2 is a triangular number, the game always reaches the same stable configuration, with piles of sizes 1,2,...,k. The purpose of the paper is to tell the (quite amusing) story of the game and to discuss mathematical problems related to the Bulgarian solitaire. The paper is dedicated to the memory of Borislav Bojanov (1944-2009), a great mathematician, person, and friend, and one of the main protagonists in the story of the Bulgarian solitaire.
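The rules above are simple enough to simulate directly. The following is a minimal sketch in which a position is represented as a sorted tuple of pile sizes, so that two positions differing only in the order of the piles count as the same; the names `bulgarian_move` and `play` are illustrative.

```python
def bulgarian_move(piles):
    """Remove one card from every pile and collect the removed cards in a new pile.

    Assumes piles is a non-empty sequence of positive pile sizes.
    """
    # Piles of size 1 disappear once their only card is removed.
    new_piles = [size - 1 for size in piles if size > 1]
    # One card was taken from each pile, so the new pile has len(piles) cards.
    new_piles.append(len(piles))
    return tuple(sorted(new_piles))


def play(piles):
    """Play until a position repeats.

    Returns (history, repeated), where history lists the distinct positions
    visited in order and repeated is the first position seen a second time.
    """
    seen = set()
    history = []
    position = tuple(sorted(piles))
    while position not in seen:
        seen.add(position)
        history.append(position)
        position = bulgarian_move(position)
    return history, position
```

Starting from a single pile of 6 cards, `play([6])` visits `(6,)`, `(1, 5)`, `(2, 4)`, `(1, 2, 3)` and then repeats `(1, 2, 3)`, the stable configuration for k=3.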
The game of war is one of the most popular international children's card games. At the beginning of the game, the pack is split into two parts; then on each move the players reveal their top cards. The player with the higher card collects both and returns them to the bottom of his hand. The player left with no cards loses. Those who played this game in their childhood did not always have enough patience to wait until the end of the game. A player who has collected almost all the cards can lose all but a few cards in the next 3 minutes. That way the children essentially conduct mathematical experiments observing chaotic dynamics. However, it is not quite so, as the rules of the game do not prescribe the order in which the winning player will put his take to the bottom of his hand: own card, then rival's or vice versa: rival's card, then own. We provide an example of a cycling game with fixed rules. Assume now that each player can seldom but regularly change the returning order. We have managed to prove that in this case the mathematical expectation of the length of the game is finite. In principle it is equivalent to the graph of the game, which has got edges corresponding to all