This paper presents discursive patinas, a technique to visualize discussions onto data visualizations, inspired by how people leave traces in the physical world. While data visualizations are widely discussed in online communities and social media, comments tend to be displayed separately from the visualization and we lack ways to relate these discussions back to the content of the visualization, e.g., to situate comments, explain visual patterns, or question assumptions. In our visualization annotation interface, users can designate areas within the visualization. Discursive patinas are made of overlaid visual marks (anchors), attached to textual comments with category labels, likes, and replies. By coloring and styling the anchors, a meta visualization emerges, showing what and where people comment and annotate the visualization. These patinas show regions of heavy discussions, recent commenting activity, and the distribution of questions, suggestions, or personal stories. We ran workshops with 90 students, domain experts, and visualization researchers to study how people use anchors to discuss visualizations and how patinas influence people's understanding of the discussion. Our
It is sometimes difficult to achieve a complete observation for a full set of observables, and partial observations are necessary. For deterministic systems, the Mori-Zwanzig formalism provides a theoretical framework for handling partial observations. Recently, data-driven algorithms based on the Koopman operator theory have made significant progress, and there is a discussion to connect the Mori-Zwanzig formalism with the Koopman operator theory. In this work, we discuss the effects of partial observation in stochastic systems using the Koopman operator theory. The discussion clarifies the importance of distinguishing the state space and the function space in stochastic systems. Even in stochastic systems, the delay-embedding technique is beneficial for partial observation, and several numerical experiments show a power-law behavior of error with respect to the amplitude of the additive noise. We also discuss the relation between the exponent of the power-law behavior and the effects of partial observation.
To make scientific truth more reliable and qualified, I propose to focus on the chain of discussion in a scientific journal rather than on each original paper. The value, quality and reliability should be judged by the form of the whole discussion chain, not only by an original report. Funding and sponsorship should also give more priority to the extension of the discussion chain. Maintaining the discussion chain and evaluating each scientific truth by the whole chain is the next value of the scientific community in the post-generative AI era.
We make three contributions. First, we formulate a discussion-graph semantics for first-order logic with equality, enabling reasoning about discussion and argumentation in AI more generally than before. This addresses the current lack of a formal reasoning framework capable of handling diverse discussion and argumentation models. Second, we generalise Dung's notion of extensions to cases where two or more graph nodes in an argumentation framework are equivalent. Third, we connect these two contributions by showing that the generalised extensions are first-order characterisable within the proposed discussion-graph semantics. Propositional characterisability of all Dung's extensions is an immediate consequence. We furthermore show that the set of all generalised extensions (acceptability semantics), too, are first-order characterisable. Propositional characterisability of all Dung's acceptability semantics is an immediate consequence.
Online forums encourage the exchange and discussion of different stances on many topics. Not only do they provide an opportunity to present one's own arguments, but may also gather a broad cross-section of others' arguments. However, the resulting long discussions are difficult to overview. This paper presents a novel unsupervised approach using large language models (LLMs) to generating indicative summaries for long discussions that basically serve as tables of contents. Our approach first clusters argument sentences, generates cluster labels as abstractive summaries, and classifies the generated cluster labels into argumentation frames resulting in a two-level summary. Based on an extensively optimized prompt engineering approach, we evaluate 19~LLMs for generative cluster labeling and frame classification. To evaluate the usefulness of our indicative summaries, we conduct a purpose-driven user study via a new visual interface called Discussion Explorer: It shows that our proposed indicative summaries serve as a convenient navigation tool to explore long discussions.
We consider the duration of discussions in face-to-face contacts and propose a stochastic model to describe it. It is based on the points of a Levy flight where the duration of each contact corresponds to the size of the clusters produced during the walk. When confronting it to the data measured from proximity sensors, we show that several datasets obtained in different environments, are precisely reproduced by the model fixing a single parameter, the Levy index, to 1.15. We analyze the dynamics of the cluster formation during the walk and compute analytically the cluster size distribution. We find that discussions are first driven by a maximum-entropy geometric distribution and then by a rich-get-richer mechanism reminiscent of preferential-attachment (the more a discussion lasts, the more it is likely to continue). In this model, conversations may be viewed as an aggregation process with a characteristic scale fixed by the mean interaction time between the two individuals.
Software workplaces are increasingly recognized as key spaces for professional development, where developers encounter various challenges in their roles, which they often discuss in online forums. This paper analyzes 47,368 posts on the Workplace StackExchange site, aggregating developer insights and applying topic modeling techniques. Through manual analysis, we identified 46 distinct topics grouped into seven categories: Employee Wellness, Communication, Career Movement \& Hiring, Conflicts \& Mistakes, Corporate Policies, Management/Supervisor Responsibilities, and Learning \& Technical Skills. Our findings show that approximately 30\% of discussions involve workplace conflicts, marking this as the most prominent topic. Additionally, we found that workplace culture, harassment, and other corporate policy-related issues represent significant areas of difficulty commonly discussed among developers.
Visual language navigation (VLN) is an embodied task demanding a wide range of skills encompassing understanding, perception, and planning. For such a multifaceted challenge, previous VLN methods totally rely on one model's own thinking to make predictions within one round. However, existing models, even the most advanced large language model GPT4, still struggle with dealing with multiple tasks by single-round self-thinking. In this work, drawing inspiration from the expert consultation meeting, we introduce a novel zero-shot VLN framework. Within this framework, large models possessing distinct abilities are served as domain experts. Our proposed navigation agent, namely DiscussNav, can actively discuss with these experts to collect essential information before moving at every step. These discussions cover critical navigation subtasks like instruction understanding, environment perception, and completion estimation. Through comprehensive experiments, we demonstrate that discussions with domain experts can effectively facilitate navigation by perceiving instruction-relevant information, correcting inadvertent errors, and sifting through in-consistent movement decisions. The perfor
Studying misinformation and how to deal with unhealthy behaviours within online discussions has recently become an important field of research within social studies. With the rapid development of social media, and the increasing amount of available information and sources, rigorous manual analysis of such discourses has become unfeasible. Many approaches tackle the issue by studying the semantic and syntactic properties of discussions following a supervised approach, for example using natural language processing on a dataset labeled for abusive, fake or bot-generated content. Solutions based on the existence of a ground truth are limited to those domains which may have ground truth. However, within the context of misinformation, it may be difficult or even impossible to assign labels to instances. In this context, we consider the use of temporal dynamic patterns as an indicator of discussion health. Working in a domain for which ground truth was unavailable at the time (early COVID-19 pandemic discussions) we explore the characterization of discussions based on the the volume and time of contributions. First we explore the types of discussions in an unsupervised manner, and then ch
Software teams are increasingly adopting different tools and communication channels to aid the software collaborative development model and coordinate tasks. Among such resources, Programming Community-based Question Answering (PCQA) forums have become widely used by developers. Such environments enable developers to get and share technical information. Interested in supporting the development and management of Open Source Software (OSS) projects, GitHub announced GitHub Discussions - a native forum to facilitate collaborative discussions between users and members of communities hosted on the platform. As GitHub Discussions resembles PCQA forums, it faces challenges similar to those faced by such environments, which include the occurrence of related discussions (duplicates or near-duplicated posts). While duplicate posts have the same content - and may be exact copies - near-duplicates share similar topics and information. Both can introduce noise to the platform and compromise project knowledge sharing. In this paper, we address the problem of detecting related posts in GitHub Discussions. To do so, we propose an approach based on a Sentence-BERT pre-trained model: the RD-Detector
Van Miltenburg et al. (2021) suggest NLP research should adopt preregistration to prevent fishing expeditions and to promote publication of negative results. At face value, this is a very reasonable suggestion, seemingly solving many methodological problems with NLP research. We discuss pros and cons -- some old, some new: a) Preregistration is challenged by the practice of retrieving hypotheses after the results are known; b) preregistration may bias NLP toward confirmatory research; c) preregistration must allow for reclassification of research as exploratory; d) preregistration may increase publication bias; e) preregistration may increase flag-planting; f) preregistration may increase p-hacking; and finally, g) preregistration may make us less risk tolerant. We cast our discussion as a dialogue, presenting both sides of the debate.
Technical debt (TD) refers to suboptimal choices during software development that achieve short-term goals at the expense of long-term quality. Although developers often informally discuss TD, the concept has not yet crystalized into a consistently applied label when describing issues in most repositories. We apply machine learning to understand developer insights into TD when discussing tickets in an issue tracker. We generate expert labels that indicate whether discussion of TD occurs in the free text associated with each ticket in a sample of more than 1,900 tickets in the Chromium issue tracker. We then use these labels to train a classifier that estimates labels for the remaining 475,000 tickets. We conclude that discussion of TD appears in about 16% of the tracked Chromium issues. If we can effectively classify TD-related issues, we can focus on what practices could be most useful for their timely resolution.
Group discussions are essential for organizing every aspect of modern life, from faculty meetings to senate debates, from grant review panels to papal conclaves. While costly in terms of time and organization effort, group discussions are commonly seen as a way of reaching better decisions compared to solutions that do not require coordination between the individuals (e.g. voting)---through discussion, the sum becomes greater than the parts. However, this assumption is not irrefutable: anecdotal evidence of wasteful discussions abounds, and in our own experiments we find that over 30% of discussions are unproductive. We propose a framework for analyzing conversational dynamics in order to determine whether a given task-oriented discussion is worth having or not. We exploit conversational patterns reflecting the flow of ideas and the balance between the participants, as well as their linguistic choices. We apply this framework to conversations naturally occurring in an online collaborative world exploration game developed and deployed to support this research. Using this setting, we show that linguistic cues and conversational patterns extracted from the first 20 seconds of a team d
We review discourses about the philosophy of science in qualitative research and evidence from cognitive linguistics in order to ground a framework for discussing the use of Large Language Models (LLMs) to support the qualitative analysis process. This framework involves asking two key questions: "is the LLM proposing or refuting a qualitative model?" and "is the human researcher checking the LLM's decision-making directly?". We then discuss an implication of this framework: that using LLMs to surface counter-examples for human review represents a promising space for the adoption of LLMs into the qualitative research process. This space is promising because it is a site of overlap between researchers working from a variety of philosophical assumptions, enabling productive cross-paradigm collaboration on tools and practices.
We consider the secret key agreement problem under the multiterminal source model proposed by Csiszár and Narayan. A single-letter characterization of the secrecy capacity is desired but remains unknown except in the extreme case with unlimited public discussion and without wiretapper's side information. Taking the problem to the opposite extreme by requiring the public discussion rate to be zero asymptotically, we obtain the desired characterization under surprisingly general setting with wiretapper's side information, silent users, trusted and untrusted helpers. An immediate consequence of the result is that the capacity with nearly no discussion is the same as the capacity with no discussion, resolving a previous conjecture in the affirmative. The idea of the proof is to characterize the capacity in the special case with neither wiretapper's side information nor untrusted helpers using a multivariate extension of Gács-Körner common information, and then extend the result to the general setting by a change of scenario that turns untrusted helpers into trusted helpers. We further show how to evaluate the capacity explicitly for finite linear sources and discuss how the current res
I briefly discuss the Martingale Posteriors Distributions paper by Edwing Hong, Chris Holmes and Stephen G. Walker
In April 2019, Aycock et al. published "Sexual harassment reported by undergraduate female physicists" in Phys. Rev. PER. Their main finding is that 3/4 of undergraduate women in physics in the U.S. report experiencing sexual harassment. Gender minorities experience high rates of harassment also. Many physics departments will want to discuss these findings with the ultimate goal of making our field harassment-free. However, there are major challenges inherent in having these discussions, since many participants will have experienced harassment themselves. We suggest questions for discussion organizers to reflect on, and ideas for how departments can approach these discussions.
Artificial Intelligence is becoming part of any technology we use nowadays. If the AI informs people's decisions, the explanation about AI's outcomes, results, and behavior becomes a necessary capability. However, the discussion of XAI features with various stakeholders is not a trivial task. Most of the available frameworks and methods for XAI focus on data scientists and ML developers as users. Our research is about XAI for end-users of AI systems. We argue that we need to discuss XAI early in the AI-system design process and with all stakeholders. In this work, we aimed at investigating how to operationalize the discussion about XAI scenarios and opportunities among designers and developers of AI and its end-users. We took the Signifying Message as our conceptual tool to structure and discuss XAI scenarios. We experiment with its use for the discussion of a healthcare AI-System.
As known, any numerical simulation is composed of two parts: (1) the initial part of writing the relevant code and (2) the running of this code on the computer screen. The second part of running the program is extensively discussed theoretically and technically in the relevant literature. In this work we pay special attention to the less discussed first part and show that it may be discussed in a terminology and notation which describe physical phenomena. As examples we discuss the two cases of simulating (1) the harmonic oscillator and (2) the electron-photon interaction which results in the known Lamb shift.
High quality classroom discussion is important to student development, enhancing abilities to express claims, reason about other students' claims, and retain information for longer periods of time. Previous small-scale studies have shown that one indicator of classroom discussion quality is specificity. In this paper we tackle the problem of predicting specificity for classroom discussions. We propose several methods and feature sets capable of outperforming the state of the art in specificity prediction. Additionally, we provide a set of meaningful, interpretable features that can be used to analyze classroom discussions at a pedagogical level.