Day 0 (Aug 28th 2024)
15:00-17:00 Tutorial: An Introduction to the Tsetlin Machine. Ole-Christoffer Granmo.
SENSQ 5317, University of Pittsburgh
https://www.tour.pitt.edu/tour/sennott-square
Zoom Webinar link: https://pitt.zoom.us/j/98379804642
The Tsetlin machine is a new universal artificial intelligence (AI) method that learns simple logical rules to understand complex things, similar to how an infant uses logic to learn about the world. Being logical, the rules become understandable to humans. Yet, unlike all other intrinsically explainable techniques, Tsetlin machines are drop-in replacements for neural networks, supporting classification, convolution, regression, reinforcement learning, auto-encoding, language models, and natural language processing. They are further ideally suited for low-cost, cutting-edge hardware solutions, enabling nanoscale intelligence, ultralow energy consumption, energy harvesting, unrivalled inference speed, and competitive accuracy. In this tutorial, I cover the basics and recent advances of Tsetlin machines, including inference and learning, advanced architectures, and applications.
17:30 Registration, Ball Room A, University Club
18:00 Reception, Ball Room A, University Club
Day 1 (Aug 29th 2024)
Ball Room A, University Club
08:30 – 09:00 Registration
09:00 – 09:10 Welcome & Opening
09:10 – 10:10 Keynote I: Tsetlin Machines and the Stunning Logical Power of Human Minds. Selmer Bringsjord.
The human mind is not only — as many say — the smartest thing in the known universe; it’s specifically a logical marvel. We now know, for example, courtesy of reverse mathematics, that members of our species are able to reason rigorously over complex formal content expressed in third-order logic (TOL). (Some of the results of Gödel from long ago anticipated this.) Moreover, since some of us have knowledge of and beliefs about others who reason in such ways, our logical power extends to reasoning in robust “theory-of-mind” fashion. How can this be? How can a helpless infant whose brain is anemic rise to these logical heights in the span of a few short years? The answer has two intertwined parts: One, we have innate, i.e. unlearned, logical capacity. Two, this capacity supports learning of the kind that Tsetlin machines provide. This two-part answer is (a) diametrically at odds with such mistaken views as that statistical/numerical deep learning is the foundation of human-level logical power, and (b) explains why such agents as those in the GPT-k series are comically bad reasoners.
10:10 – 10:30 Break
10:30 – 12:00 Research Session 1
TM Literal Streaming with Absorbing States. Youmna Abdelwahab, Ole-Christoffer Granmo and Lei Jiao.
Sparse Tsetlin Machines (STMs) make it possible to assign a small sample of the literals to each clause and then permanently prune those literals during learning through absorbing automaton states. By trimming the STM in this manner, one can achieve a tenfold speed increase. However, when each clause is limited to the literal sample selected at clause creation, accuracy can drop. To reduce this accuracy loss, we introduce a scheme for streaming unallocated literals through the clauses: the literals left out at clause creation are gradually introduced to the corresponding clauses during learning. Each time an absorbing state eliminates a literal from a clause, a new unallocated one is added to that clause, placed in an insertion state. We study the effectiveness of the scheme at various degrees of sampling, varying the absorbing and insertion states. In particular, we investigate the effect of incorporating unallocated literals during learning. Across several benchmark datasets, we observe a boost in accuracy at sampling rates of 5% and 20%. Without literal streaming, however, accuracy drops markedly at sampling rates of 1% to 4%, which confirms the positive effect literal streaming has on STMs. In conclusion, literal streaming makes the Tsetlin Machine more scalable, yielding higher accuracy with fewer resources.
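To make the streaming mechanism concrete, here is a minimal sketch in Python. The class, state constants, and update rules below are illustrative simplifications, not the authors' implementation: each clause holds a sampled subset of literals, and whenever an absorbing state prunes a literal, an unallocated one is streamed in at an insertion state.

```python
# Hypothetical, simplified model of literal streaming with absorbing states.
import random

NUM_LITERALS = 100        # total literals in the problem
SAMPLE_SIZE = 5           # literals initially allocated per clause (5% sampling)
ABSORB_STATE = 0          # terminal exclude state: the literal is pruned
INSERT_STATE = 3          # state assigned to newly streamed-in literals
MAX_STATE = 10

class StreamingClause:
    def __init__(self):
        allocated = random.sample(range(NUM_LITERALS), SAMPLE_SIZE)
        self.states = {lit: INSERT_STATE for lit in allocated}
        self.unallocated = [l for l in range(NUM_LITERALS) if l not in self.states]

    def penalize(self, literal):
        """Move a literal toward exclusion; stream in a replacement if absorbed."""
        self.states[literal] -= 1
        if self.states[literal] <= ABSORB_STATE:
            del self.states[literal]          # permanently prune the literal
            if self.unallocated:              # stream in an unallocated literal
                idx = random.randrange(len(self.unallocated))
                self.states[self.unallocated.pop(idx)] = INSERT_STATE

    def reward(self, literal):
        self.states[literal] = min(self.states[literal] + 1, MAX_STATE)
```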
Pruning Literals for Highly Efficient Explainability at Word Level. Rohan Yadav, Bimal Bhattarai, Abhik Jana, Lei Jiao and Seid Muhie Yimam.
Designing explainable models has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models provide only a limited explanation for their predictions. Within the spectrum of explainable models, the Tsetlin Machine (TM) is promising because of its capability of providing word-level explanations using propositional logic. However, concerns arise over the elaborate combinations of literals (propositional logic) in the clauses, which make the model difficult for humans to comprehend despite its transparent learning process. In this paper, we design a post-hoc pruning of clauses that eliminates the randomly placed literals in each clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the pruned TM's attention map aligns more closely with the human attention map than the vanilla TM's. In addition, the pairwise similarity measure also surpasses that of attention-map-based neural network models. In terms of accuracy, the proposed pruning method does not degrade accuracy significantly, and even enhances performance by 4% to 9% on some test data.
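A hedged sketch of the post-hoc idea follows. The pruning criterion here, dropping a literal when its removal does not reduce clause precision on held-out data, is an illustrative stand-in for the paper's procedure, and all names are ours.

```python
# Illustrative post-hoc literal pruning: a clause is a list of (index, positive)
# literals; a literal is removed when the pruned clause is at least as precise.
def clause_fires(literals, x):
    """Clause output: conjunction of its literals over a Boolean input x."""
    return all(x[i] if positive else not x[i] for i, positive in literals)

def precision(literals, X, y, target=1):
    fired = [clause_fires(literals, x) for x in X]
    hits = sum(1 for f, label in zip(fired, y) if f and label == target)
    total = sum(fired)
    return hits / total if total else 0.0

def prune_clause(literals, X, y):
    base = precision(literals, X, y)
    kept = list(literals)
    for lit in list(literals):
        trial = [l for l in kept if l != lit]
        if trial and precision(trial, X, y) >= base:
            kept = trial                      # literal was spurious: remove it
    return kept
```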
Reasoning by Elimination: A Technique for Tsetlin Machines. Ahmed Kadhim, Ole-Christoffer Granmo, Lei Jiao and Rishad Shafik.
The Tsetlin Machine (TM) has gained significant attention in Machine Learning (ML). By employing logical fundamentals, it facilitates pattern learning and representation, offering an alternative approach for developing comprehensible Artificial Intelligence (AI) with a specific focus on pattern classification in the form of conjunctive clauses. In the domain of Natural Language Processing (NLP), the TM is utilised to construct word embeddings and describe target words using clauses. To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clause formulation, which involves incorporating feature negations to provide a more comprehensive representation. In more detail, this paper employs the Tsetlin Machine Auto-Encoder (TM-AE) architecture to generate dense word vectors, aiming at capturing contextual information by extracting feature-dense vectors for a given vocabulary. Thereafter, the principle of RbE is explored to improve descriptiveness and optimise the performance of the TM. Specifically, the specificity parameter s and the voting margin parameter T are leveraged to regulate feature distribution in the state space, resulting in a dense representation of information for each clause. In addition, we investigate the state spaces of the TM-AE, especially for the forgotten/excluded features. Empirical investigations on artificially generated data, the IMDB dataset, and the 20 Newsgroups dataset showcase the robustness of the TM, with accuracy reaching 90.62% on IMDB.
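As background for Reasoning by Elimination: TM clauses operate over both original and negated features. A minimal illustration (our own, not from the paper) of widening a Boolean input with its negations so that clauses can include "NOT feature" literals:

```python
# Appending negated features so a clause such as (NOT f1 AND f2) becomes a
# plain conjunction over the widened literal vector.
import numpy as np

X = np.array([[1, 0, 1],
              [0, 1, 0]], dtype=np.uint8)     # original Boolean features
literals = np.concatenate([X, 1 - X], axis=1) # columns 3..5 are the negations
```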
12:00 – 13:30 Lunch
13:30 – 15:00 Research Session 2
Accelerated Tsetlin Machine Inference Through Incremental Model Re-evaluation. Charul Giri, Ole-Christoffer Granmo and Herke van Hoof.
Tsetlin Machines (TMs) are a new class of machine learning algorithms that leverage propositional (Boolean) logic. While ensuring transparent and inherently interpretable decision-making, they handle relatively complex pattern recognition tasks, including classification, convolution, and regression. However, as with most machine learning approaches, inference is computationally expensive for large models because each new input requires recalculating the model output from scratch. Slow evaluation can be problematic in time-critical tasks, hindering the deployment of more powerful models. This paper proposes a new TM inference approach that drastically reduces computational complexity through incremental model re-evaluation. To this end, we single out small incremental computations by tracing which clauses are impacted by each input feature. Our tailored solution for the TM offers a more scalable and efficient inference strategy, particularly beneficial when new inputs are similar to previous ones. The results of our experiments on benchmark datasets demonstrate that our approach not only retains the same precision as the traditional TM but also provides significantly faster inference, achieving up to a 40-times speedup. The code is made available on GitHub.
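A simplified sketch of the incremental strategy follows. The bookkeeping below, a feature-to-clause index plus cached clause outputs, illustrates the principle rather than the paper's exact data structures.

```python
# Incremental re-evaluation: cache clause outputs; on a new input, recompute
# only clauses that mention a feature that flipped since the previous input.
import numpy as np

class IncrementalTM:
    def __init__(self, clauses, weights, num_features):
        self.clauses = clauses                # clause -> list of (feature, positive)
        self.weights = weights                # (num_clauses, num_classes) votes
        self.touched = [[] for _ in range(num_features)]
        for c, lits in enumerate(clauses):    # feature -> clauses mentioning it
            for f, _ in lits:
                self.touched[f].append(c)
        self.prev_x = None
        self.outputs = np.zeros(len(clauses), dtype=bool)

    def _eval(self, c, x):
        return all(x[f] if pos else not x[f] for f, pos in self.clauses[c])

    def predict(self, x):
        if self.prev_x is None:
            dirty = range(len(self.clauses))            # first input: full pass
        else:
            flipped = np.flatnonzero(x != self.prev_x)  # changed features only
            dirty = {c for f in flipped for c in self.touched[f]}
        for c in dirty:
            self.outputs[c] = self._eval(c, x)          # untouched clauses keep
        self.prev_x = x.copy()                          # their cached outputs
        votes = self.outputs.astype(int) @ self.weights
        return int(np.argmax(votes))
```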
Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines. Vojtech Halenka, Ahmed Kadhim, Paul Clarke, Bimal Bhattarai, Rupsa Saha, Ole-Christoffer Granmo, Lei Jiao and Per-Arne Andersen.
Tsetlin Machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector-based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting hypervector-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new Booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
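The encoding idea can be illustrated in a few lines of Python. Binary hypervectors and majority-rule bundling are assumptions made for this sketch; the paper may use a different hypervector construction.

```python
# Encode an arbitrary set of symbols as one Boolean hypervector for TM input.
import numpy as np

D = 10_000                                    # hyperdimensional space size
rng = np.random.default_rng(0)
codebook = {}                                 # symbol -> random binary HV

def hv(symbol):
    if symbol not in codebook:
        codebook[symbol] = rng.integers(0, 2, size=D, dtype=np.uint8)
    return codebook[symbol]

def encode(symbols):
    """Bundle a set of symbols by bitwise majority vote."""
    stacked = np.stack([hv(s) for s in symbols])
    return (stacked.sum(axis=0) * 2 > len(symbols)).astype(np.uint8)

x = encode(["the", "movie", "was", "great"])  # Boolean vector for a TM
```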
Pre-Sorted Tsetlin Machine (The Genetic K-Medoid Method). Jordan Morris and Alex Yakovlev.
This paper proposes a machine learning pre-sort stage to traditional supervised learning using Tsetlin Machines. Initially, K data-points are identified from the dataset using an expedited genetic algorithm to solve the maximum dispersion problem. These are then used as the initial placement to run the K-Medoid clustering algorithm. Finally, an expedited genetic algorithm is used to align K independent Tsetlin Machines by maximising Hamming distance. For MNIST-level classification problems, results demonstrate up to 10% improvement in accuracy, ∼383X reduction in training time and ∼99X reduction in inference time.
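For intuition, here is a sketch of dispersed seeding for K-Medoids; a simple greedy farthest-point rule stands in for the paper's expedited genetic algorithm, purely for illustration.

```python
# Greedy stand-in for max-dispersion seed selection before K-Medoid clustering.
import numpy as np

def dispersed_seeds(X, k, rng=np.random.default_rng(0)):
    seeds = [int(rng.integers(len(X)))]                 # start from a random point
    for _ in range(k - 1):
        d = np.min(                                     # distance to nearest seed
            [np.linalg.norm(X - X[s], axis=1) for s in seeds], axis=0)
        seeds.append(int(np.argmax(d)))                 # farthest point next
    return X[seeds]                                     # initial medoid placement
```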
15:00 – 15:30 Break
15:30 – 17:00 Research Session 3
In-Memory Learning Automata Architecture using Y-Flash Cell. Omar Ghazal, Tian Lan, Shalman Ojukwu, Komal Krishnamurthy, Alex Yakovlev and Rishad Shafik.
The modern implementation of machine learning architectures faces significant challenges due to frequent data transfer between memory and processing units. In-memory computing, primarily through memristor-based analog computing, offers a promising solution to overcome this von Neumann bottleneck. In this technology, data processing and storage are located inside the memory. Here, we introduce a novel approach that utilizes floating-gate Y-Flash memristive devices manufactured with a standard 180 nm CMOS process. These devices offer attractive features, including analog tunability and moderate device-to-device variation; such characteristics are essential for reliable decision-making in ML applications. This paper applies a new machine learning algorithm, the Tsetlin Machine (TM), to an in-memory processing architecture. The TM's learning element, the Tsetlin Automaton, is mapped onto a single Y-Flash cell, where the Automaton's state range is transferred into the Y-Flash's conductance scope. Through comprehensive simulations, the proposed hardware implementation of the learning automata, particularly for Tsetlin machines, has demonstrated enhanced scalability and on-edge learning capabilities.
Hardware Implementation of an Adaptive Finite State Machine Utilizing Tsetlin Machine. Yehuda Rudin, Osnat Keren, Michal Yemini and Alexander Fish.
In many applications, the deployed system is required to adjust to unpredictable changes in environments and real-time circumstances. The Finite State Machine (FSM) is a computational model that is widely used for control in digital designs. In this study, we assume that the FSM behavioral model ought to adapt to a changing environment whose characteristics are unknown in advance. To achieve this goal, instead of synthesizing the combinational logic from a predefined behavioral model, we utilize a Tsetlin Machine (TM) design to construct the targeted logic functions through learning. The paper presents an approach to the hardware implementation of a TM-based adaptive FSM that can be applied to ASICs. The implementation is validated and tested on an FPGA platform. The data collected from the measurements provides valuable insights into the learning process, such as the connection between the organization of the clauses formed by the TM and the resulting learning rate.
9T SRAM Cell based Wired-OR Logic Arrays for Tsetlin Machine Inference. Komal Krishnamurthy, Jesse Ojukwu, Shengyu Duan, Omar Ghazal, Alex Yakovlev and Rishad Shafik.
Tsetlin Machine (TM) has recently emerged as a promising alternative to arithmetically driven machine learning algorithms, such as deep neural networks (DNNs). TMs are based on single-layered propositional logic followed by summative voting between conjunctive clauses. Although the logical underpinning has been demonstrated to have significantly lower complexity than DNNs, the TM's memory footprint can grow dramatically as the model size increases. When designing hardware accelerator architectures, such an increase in complexity can be associated with significantly increased data movement overheads. In this paper, we propose a processing in-memory (PIM) inspired TM inference architecture. Central to this architecture are 9-transistor (9T) static random access memory (SRAM) Wired-OR logic arrays, fitting the natural logic underpinning of the TM. The implementation of Wired-OR logic using 9T SRAM cells provides the crucial embedding of logic with storage, thereby reducing the overall logic complexity and data movement costs significantly. Through extensive simulations, we analyze the functional properties of the inference accelerator. Further, we study scalability under multiple Tsetlin automata scenarios and investigate parametric behaviors such as power consumption and PVT variations through Monte Carlo simulations. We show that our design achieves a 72% area reduction per TA propositional logic when compared with a vanilla CMOS design implementing the TAs using flip-flops.
18:00 Conference Banquet. The Oaklander, 5130 Bigelow Boulevard, 10th Floor
Day 2 (Aug 30th 2024)
Ball Room A, University Club
08:30 – 09:00 Registration
09:00 – 09:10 Opening & Announcements
09:10 – 10:10 Keynote II: Superconducting Tsetlin Machines and Neuromorphic Computing. Dilip Vasudevan, Christoph Kirst.
Current trends in foundational models of AI have unleashed new challenges in system design to handle new generative AI applications. When scaled, these systems lead to highly memory-intensive and communication-congesting workloads that current von Neumann architectures cannot handle efficiently, resulting in highly energy-consuming systems. Alternative paradigms in computing, logic, architectures and devices are needed to tackle this energy crisis. Superconducting logic based systems are one of the promising avenues for developing new directions to lower energy consumption by several orders of magnitude.
In this presentation, we will look at recent and promising computing paradigms developed using superconducting electronics (SCE) and their advantages towards energy efficiency and scalability. After introducing superconducting technologies, we will present our new computing model called Super-Tsetlin, a Superconducting Tsetlin Machine designed using superconducting RSFQ technology, and demonstrate some applications. We will then discuss superconducting Temporal Design for a set of hard compute problems and its benefits. Finally, we will introduce innovative neuromorphic computing frameworks for high-performance and energy-efficient computations, including neuromorphic oscillator networks and their implementations and applications in superconducting technology.
We will conclude with a future vision towards building energy-efficient systems for foundational models of AI and neuromorphic computing paradigms.
10:10 – 10:30 Break
10:30 – 12:00 Research Session 4
Time-Domain Argmax Architecture for Tsetlin Machine Classification. Tian Lan, Omar Ghazal, Alex Chan, Shalman Ojukwu, Rishad Shafik and Alex Yakovlev.
Machine Learning (ML) techniques have expanded the boundaries of artificial intelligence, especially in pattern recognition, automated decision-making, and data analysis. Nonetheless, the hardware implementation of maxima arguments (argmax), significant in ML algorithms, has increasingly become a limiting factor in edge computing. As the dataset grows, traditional argmax methods, including magnitude or Hamming comparators, face challenges related to excessive resource demands, slower response times, and higher power consumption. The Tsetlin machine (TM), as an emerging propositional logic-based ML algorithm, offers interpretability and hardware affinity. This paper proposes a time-domain argmax architecture that is well-suited to the TM inference process. The architecture converts the arithmetic operations on input variables into delay accumulation and identifies the maximal variable by arbitrating the first-arriving signal based on the winner-take-all principle. The design flow is based on a four-phase “handshake” logic for quasi-delay insensitivity, where the causality of signal transitions is modeled by a signal transition graph. The proposed design has been successfully validated on Cadence platforms utilizing TSMC 65nm technology, demonstrating a considerable size reduction compared to conventional HDL-based argmax circuits. Moreover, as the input data set expands, the design's energy efficiency significantly improves.
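A behavioral model of the principle (illustrative Python, not the asynchronous circuit): class sums are converted into delays so the largest sum arrives first, and a winner-take-all arbiter selects the first arrival.

```python
# Time-domain argmax: larger class sum -> shorter delay -> earlier arrival.
import numpy as np

def time_domain_argmax(class_sums, t_unit=1e-9):
    max_sum = np.max(class_sums)
    arrival = (max_sum - np.asarray(class_sums)) * t_unit  # delay accumulation
    return int(np.argmin(arrival))                          # first arrival wins

print(time_domain_argmax([12, 47, 33]))  # -> 1
```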
Multi-Layer Tsetlin Machine: Architecture and Performance Evaluation. Olga Tarasyuk, Anatoliy Gorbenko, Rishad Shafik and Alex Yakovlev.
The Tsetlin Machine (TM) is a recent automaton-based algorithm rooted in reinforcement learning. It has demonstrated competitive accuracy on many popular benchmarks while providing natural interpretability. Due to its logical underpinning, it is amenable to hardware implementation with faster performance and higher energy efficiency than conventional Artificial Neural Networks. This paper introduces a multi-layer architecture of Tsetlin Machines with the aim of further boosting TM performance via a hierarchical feature learning approach. This is seen as a way of creating hierarchical logic expressions from the original Boolean literals, surpassing single-layer TMs in their ability to capture more complex patterns and high-level features. In this work we demonstrate that the multi-layer TM considerably outperforms the single-layer TM architecture on several benchmarks while maintaining the ability to interpret its logic inference. However, it has also been shown that uncontrolled growth in the number of layers leads to overfitting.
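A minimal sketch of the layering idea (the clause structures below are toy stand-ins for trained TM layers): the Boolean clause outputs of one layer become the input literals of the next, forming hierarchical logic expressions.

```python
# Each clause is a list of (feature_index, is_positive) literals; layer 2
# operates on the Boolean outputs of layer 1's clauses.
import numpy as np

def eval_layer(clauses, x):
    return np.array([all(x[f] if pos else not x[f] for f, pos in clause)
                     for clause in clauses], dtype=np.uint8)

x = np.array([1, 0, 1, 1], dtype=np.uint8)
layer1 = [[(0, True), (1, False)], [(2, True), (3, True)]]
layer2 = [[(0, True), (1, True)]]           # literals = layer-1 clause outputs
h = eval_layer(layer1, x)                   # hierarchical Boolean features
y = eval_layer(layer2, h)
```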
An Optimized Toolbox for Advanced Image Processing with Tsetlin Machine Composites. Ylva Grønningsæter, Halvor S. Smørvik and Ole-Christoffer Granmo.
The Tsetlin Machine (TM) has achieved competitive results on several image classification benchmarks, including MNIST, K-MNIST, F-MNIST, and CIFAR-2. However, color image classification is arguably still in its infancy for TMs, with CIFAR-10 being a focal point for tracking progress. Over the past few years, TM’s CIFAR-10 accuracy has increased from around 61% in 2020 to 75.1% in 2023 with the introduction of Drop Clause. In this paper, we leverage the recently proposed TM Composites architecture and introduce a range of TM Specialists that use various image processing techniques. These include Canny edge detection, Histogram of Oriented Gradients, adaptive mean thresholding, adaptive Gaussian thresholding, Otsu’s thresholding, color thermometers, and adaptive color thermometers. In addition, we conduct a rigorous hyperparameter search, where we uncover optimal hyperparameters for several of the TM Specialists. The result is a toolbox that provides new state-of-the-art results on CIFAR-10 for TMs with an accuracy of 82.8%. In conclusion, our toolbox of TM Specialists forms a foundation for new TM applications and a landmark for further research on TM Composites in image analysis.
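To illustrate the specialist idea, here is a sketch in which each TM of a Composite sees the image through a different preprocessing lens. The OpenCV calls are standard; the final thresholding-to-bits step is a simplifying assumption, and HOG and the color thermometers are omitted for brevity.

```python
# Build Boolean "views" of a grayscale image, one per TM Specialist.
import cv2
import numpy as np

def specialists(gray):
    """gray: uint8 grayscale image -> dict of Boolean views for separate TMs."""
    views = {
        "canny": cv2.Canny(gray, 100, 200),
        "adaptive_mean": cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2),
        "adaptive_gauss": cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2),
        "otsu": cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1],
    }
    return {name: (v > 0).astype(np.uint8) for name, v in views.items()}
# Each Boolean view trains its own TM; the Composite sums the specialists' votes.
```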
12:00 – 13:30 Lunch
13:30 – 14:30 Research Session 5
Generating Bayesian Network Models from Data Using Tsetlin Machines. Christian Blakely.
Bayesian networks (BNs) are directed acyclic graphical (DAG) models that have been adopted into many fields for their strengths in transparency, interpretability, probabilistic reasoning, and causal modeling. Given a set of data, one hurdle towards using BNs is in building the network graph from the data that properly handles dependencies, whether correlated or causal. In this paper, we propose an initial methodology for discovering network structures using Tsetlin Machines (TMs).
Using the Tsetlin Machine to Discover Fatality-based Transitions in Political Violence. Hsu-Chiang Hsu, Bree Bang-Jensen, Michael Colaresi, Panos Chrysanthis and Vladimir Zadorozhny.
There is considerable momentum across multiple continents to build and deploy systems that can usefully inform policy-makers and aid agencies about the timing, location, and scale of political violence. In this paper we introduce a framework that utilizes an architecture based on the Tsetlin Machine to detect notable transitions between lower and higher levels of political violence as measured by fatalities. We engineer input features based on the influential framework of horizontal inequalities (HI) and conflict history, as these have been theorized to presage violence. However, we encounter the challenge that some classes are more or less distinguishable from others using TM-generated rules. We leverage the analytical structure of TMs and their weights to explain these clusters. To do so, we introduce the concept of a class spectrum and reduce the dimensions of these spectra across classes and features with principal components analysis (PCA). This pipeline allows us to find transition (change) points between the fatality classes and cluster the classes into efficiently detectable groups. These tools provide both researchers and policy-makers with a means of understanding similarities and differences in explanations across different conflict phases.
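A sketch of the spectrum-plus-PCA step (the shapes and random data below are hypothetical): each class's vector of clause weights is treated as its spectrum and projected to two dimensions.

```python
# Project per-class clause-weight "spectra" with PCA to see which fatality
# classes the TM explains with similar rules.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
spectra = rng.normal(size=(6, 200))      # 6 fatality classes x 200 clause weights
coords = PCA(n_components=2).fit_transform(spectra)
# Nearby points suggest classes detectable as one group; large jumps between
# adjacent ordered classes indicate transition (change) points.
```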
14:30 – 15:00 Break
15:00 – 16:30 Poster Session
TMCoSim: Hardware-Software Co-Simulation Framework for Tsetlin Machines. Tousif Rahman, Gang Mao, Sidharth Maheshwari, Alex Chan, Marcos Sartori, Zhuang Shao, Rishad Shafik and Alex Yakovlev.
Tsetlin Machines have demonstrated competitive energy efficiency, latency, and frugal resource usage when realized in custom hardware for on-edge training applications. However, the randomly influenced decisions made by the Tsetlin Machine (TM) feedback algorithm make it difficult to validate both the functionality and the learning behavior of TM hardware: tracking the full TM system state in hardware against a golden standard, as is the typical procedure in hardware verification flows, becomes problematic. This has in large part prevented more TM hardware training implementations. To address this challenge, this paper presents TMCoSim, the first digital hardware-software co-simulation tool for validating the functionality and learning efficacy of Tsetlin Machine training RTL. The tool runs a software golden standard and the user's prospective RTL hardware concurrently. The random stimuli required for training can be provided from either the hardware or the software process, such that the TM system state in both hardware and software can be verified over a training period. The tool's primary objective is to facilitate faster development of TM training implementations in hardware, with a framework supporting concurrent development of hardware and software. It also opens research opportunities for better examination of the quality of hardware pseudo-random number generators as a function of TM training performance.
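The lockstep principle can be sketched in a few lines. The two step functions are placeholders for the software golden model and the simulated RTL; the point is that a single seeded stream of random stimuli drives both sides so their states can be compared every step.

```python
# Lockstep co-simulation: one seed drives both the golden model and the RTL.
import numpy as np

rng = np.random.default_rng(42)

def golden_step(state, sample, r):    # placeholder software reference model
    return state + (1 if r < 0.5 else -1)   # a real step would also use sample

def rtl_step(state, sample, r):       # placeholder for the simulated RTL
    return state + (1 if r < 0.5 else -1)

sw = hw = 5
for step in range(1000):
    sample, r = int(rng.integers(0, 2)), rng.random()  # shared random stimuli
    sw, hw = golden_step(sw, sample, r), rtl_step(hw, sample, r)
    assert sw == hw, f"state divergence at step {step}"
```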
A Novel Tsetlin Machine with Enhanced Generalization. Usman Anjum and Justin Zhan.
The Tsetlin machine (TM) is a novel machine learning approach that implements propositional logic to perform different tasks like classification and regression. The TM not only achieves competitive accuracy in these tasks but also provides results that are explainable and easy to implement using simple hardware. The TM learns using clauses based on the features of the data, and final classification is done using a combination of the clauses. In this paper, we propose the novel idea of adding regularizers to the TM, called Regularized TM (RegTM), to improve generalization. Regularizers have been widely used in machine learning to improve accuracy. We explore different regularization strategies and how they influence performance. We show the feasibility of our methodology through different experiments on benchmark datasets.
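One plausible reading of the regularization idea, sketched purely for illustration: the abstract does not specify the penalty, so an L1-style penalty on clause length is assumed here.

```python
# Class sum with a hypothetical L1-style penalty on the lengths of firing
# clauses, discouraging long clauses that tend to overfit.
def regularized_class_sum(clause_outputs, polarities, lengths, lam=0.1):
    votes = sum(o * p for o, p in zip(clause_outputs, polarities))
    penalty = lam * sum(l for o, l in zip(clause_outputs, lengths) if o)
    return votes - penalty

score = regularized_class_sum([1, 0, 1], [+1, -1, +1], [3, 5, 8])  # 2 - 1.1
```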
Detection of Microaneurysms from Color Fundus Images using Tsetlin Machine. Aditya Deepak Kumar, Rishad Shafik and Srinivas Boppu.
Diabetic retinopathy is a common complication of diabetes mellitus, which causes lesions on the retina that affect vision. If it is not detected early, it can lead to permanent blindness. The earliest manifestation of diabetic retinopathy is the presence of microaneurysms on the retina. The conventional diagnosis process involves manual scanning of retinal fundus images by ophthalmologists, which is laborious and costly. Moreover, due to the presence of unwanted artifacts, such scanning methods can be prone to misdiagnosis. In this paper, we propose a computationally automated detection of microaneurysms from color fundus images using a novel machine learning algorithm called the Tsetlin Machine (TM). The Tsetlin Machine algorithm is built upon propositional logic and offers enhanced transparency and simplicity compared to current models such as artificial neural networks. In addition, its adaptability to hardware facilitates practical applications, rendering it highly valuable in various contexts. We have developed an end-to-end algorithmic pipeline, including data pre-processing, and show that the vanilla TM can achieve an accuracy of up to 88% on the chosen color fundus dataset. Furthermore, a hybrid convolutional neural network-Tsetlin machine model is proposed, which results in an improved accuracy of up to 94%.
Tree Based Interpretability Visualisation of The Tsetlin Machine. Shalman Ojukwu, Komal Krishnamurthy, Tian Lan, Omar Ghazal, Alex Yakovlev and Rishad Shafik.
In recent years, artificial intelligence and machine learning have emerged as stable solutions across various industries and applications. Neural Networks are the current industry standard due to their high performance and versatile architectures. However, the black-box nature of their complex algorithms introduces challenges in understanding the training process and interpreting predictions, especially in safety-critical and high-risk domains where inadequate performance can lead to severe consequences. The Tsetlin Machine (TM) is an emerging algorithm that brings more comprehensibility into the learning procedure and offers high interpretability in the pattern recognition process through simple propositional formulas. Although the TM offers feature-to-output interpretability, this depends heavily on the booleanisation method. This paper proposes an interpretable approach to SOC estimation and a new booleanisation scheme that considers the feature distribution while still retaining interpretability.
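A sketch of what a distribution-aware booleanisation could look like (an assumption consistent with the abstract, not necessarily the paper's scheme): thermometer encoding with thresholds placed at empirical quantiles rather than at uniform intervals.

```python
# Quantile-based thermometer encoding: thresholds follow the data distribution,
# so each bit splits the observed values into roughly equal-mass regions.
import numpy as np

def quantile_thermometer(values, bits=8):
    qs = np.quantile(values, np.linspace(0, 1, bits + 2)[1:-1])
    return (np.asarray(values)[:, None] >= qs).astype(np.uint8)

readings = np.array([0.12, 0.55, 0.91])   # e.g. hypothetical SOC readings
print(quantile_thermometer(readings, bits=4))
```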
Enhancing Neural Network Learning Using Soft-Margin Taylor Softmax Propositional Network. Brandon Thompson, Usman Anjum and Justin Zhan.
The use of propositional logic-based statements for different machine learning tasks is becoming widespread, and methods like Tsetlin Machines have shown how effective they can be across multiple tasks, like classification and regression. In this paper, we explore the prospect of using neural networks to learn from propositional logic statements. One of the major issues with using neural networks to learn from propositional logic statements is that the statements are non-differentiable. There are different strategies to solve this issue, like adding a softmax layer. We show the viability of alternative strategies to address this non-differentiability and apply various techniques to classify objects using propositional logic statements. We run experiments on different datasets to show the effectiveness of these models.
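For reference, a sketch of the soft-margin Taylor softmax: the Taylor softmax definition is standard, while the margin placement follows the common SM-Taylor-softmax formulation and is an assumption about this paper's exact variant.

```python
# Taylor softmax replaces exp(z) with its second-order Taylor expansion,
# 1 + z + z^2/2 = ((z+1)^2 + 1)/2, which stays strictly positive.
import numpy as np

def taylor_softmax(z):
    f = 1.0 + z + 0.5 * z**2
    return f / f.sum(axis=-1, keepdims=True)

def sm_taylor_softmax(z, target, margin=0.2):
    """Subtract a margin from the target logit before normalizing (training)."""
    z = z.copy()
    z[..., target] -= margin
    return taylor_softmax(z)

probs = taylor_softmax(np.array([1.0, -0.5, 2.0]))
```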
16:30-17:00 Closing
17:30 – Tour of Nationality and Heritage Rooms
Visitor Center for the Nationality & Heritage Rooms, 1st Floor of the Cathedral of Learning