
New submissions for Thu, 16 Nov 23

Open zoq opened this issue 1 year ago • 0 comments

Keyword: sgd

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

  • Authors: Naoki Sato, Hideaki Iiduka
  • Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract The graduated optimization approach is a heuristic method for finding globally optimal solutions for nonconvex functions and has been theoretically analyzed in several studies. This paper defines a new family of nonconvex functions for graduated optimization, discusses their sufficient conditions, and provides a convergence analysis of the graduated optimization algorithm for them. It shows that stochastic gradient descent (SGD) with mini-batch stochastic gradients has the effect of smoothing the function, the degree of which is determined by the learning rate and batch size. This finding provides theoretical insights from a graduated optimization perspective on why large batch sizes fall into sharp local minima, why decaying learning rates and increasing batch sizes are superior to fixed learning rates and batch sizes, and what the optimal learning rate scheduling is. To the best of our knowledge, this is the first paper to provide a theoretical explanation for these aspects. Moreover, a new graduated optimization framework that uses a decaying learning rate and increasing batch size is analyzed and experimental results of image classification that support our theoretical findings are reported.
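As a rough illustration of the kind of schedule this abstract refers to (a decaying learning rate paired with an increasing batch size), here is a minimal Python sketch. The function name `graduated_schedule`, the geometric decay and growth factors, and the stage count are assumptions made for illustration; none of them are taken from the paper.

```python
def graduated_schedule(lr0, batch0, num_stages, lr_decay=0.5, batch_growth=2):
    """Return a list of (learning_rate, batch_size) pairs, one per stage.

    Hypothetical helper: each stage multiplies the learning rate by
    `lr_decay` and the batch size by `batch_growth`. The paper's actual
    schedule may differ.
    """
    schedule = []
    lr, batch = lr0, batch0
    for _ in range(num_stages):
        schedule.append((lr, batch))
        lr *= lr_decay          # decay the learning rate
        batch *= batch_growth   # grow the batch size
    return schedule


for stage, (lr, batch) in enumerate(graduated_schedule(0.1, 32, 4)):
    print(f"stage {stage}: lr={lr:g}, batch_size={batch}")
```

Each stage would then run ordinary mini-batch SGD with the listed pair before moving on to the next one.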

Keyword: optimization

MasterRTL: A Pre-Synthesis PPA Estimation Framework for Any RTL Design

  • Authors: Wenji Fang, Yao Lu, Shang Liu, Qijun Zhang, Ceyu Xu, Lisa Wu Wills, Hongce Zhang, Zhiyao Xie
  • Subjects: Hardware Architecture (cs.AR)
  • Arxiv link:
  • Pdf link:
  • Abstract In modern VLSI design flow, the register-transfer level (RTL) stage is a critical point, where designers define precise design behavior with hardware description languages (HDLs) like Verilog. Since the RTL design is in the format of HDL code, the standard way to evaluate its quality requires time-consuming subsequent synthesis steps with EDA tools. This time-consuming process significantly impedes design optimization at the early RTL stage. Despite the emergence of some recent ML-based solutions, they fail to maintain high accuracy for any given RTL design. In this work, we propose an innovative pre-synthesis PPA estimation framework named MasterRTL. It first converts the HDL code to a new bit-level design representation named the simple operator graph (SOG). By only adopting single-bit simple operators, this SOG proves to be a general representation that unifies different design types and styles. The SOG is also more similar to the target gate-level netlist, reducing the gap between RTL representation and netlist. In addition to the new SOG representation, MasterRTL proposes new ML methods for the RTL-stage modeling of timing, power, and area separately. Compared with state-of-the-art solutions, the experiment on a comprehensive dataset with 90 different designs shows accuracy improvement by 0.33, 0.22, and 0.15 in correlation for total negative slack (TNS), worst negative slack (WNS), and power, respectively.

Real-time topology optimization via learnable mappings

  • Authors: Gabriel Garayalde, Matteo Torzoni, Matteo Bruggi, Alberto Corigliano
  • Subjects: Computational Engineering, Finance, and Science (cs.CE)
  • Arxiv link:
  • Pdf link:
  • Abstract In traditional topology optimization, the computing time required to iteratively update the material distribution within a design domain strongly depends on the complexity or size of the problem, limiting its application in real engineering contexts. This work proposes a multi-stage machine learning strategy that aims to predict an optimal topology and the related stress fields of interest, either in 2D or 3D, without resorting to any iterative analysis and design process. The overall topology optimization is treated as a regression task in a low-dimensional latent space that encodes the variability of the target designs. First, a fully-connected model is employed to surrogate the functional link between the parametric input space characterizing the design problem and the latent space representation of the corresponding optimal topology. The decoder branch of an autoencoder is then exploited to reconstruct the desired optimal topology from its latent representation. The deep learning models are trained on a dataset generated through a standard method of topology optimization implementing the solid isotropic material with penalization, for varying boundary and loading conditions. The underlying hypothesis behind the proposed strategy is that optimal topologies share enough common patterns to be compressed into small latent space representations without significant information loss. Results relevant to a 2D Messerschmitt-Bölkow-Blohm beam and a 3D bridge case demonstrate the capabilities of the proposed framework to provide accurate optimal topology predictions in a fraction of a second.

Robust Differentiable Predictive Control with Safety Guarantees: A Predictive Safety Filter Approach

  • Authors: Wenceslao Shaw Cortez, Jan Drgona, Draguna Vrabie, Mahantesh Halappanavar
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link:
  • Pdf link:
  • Abstract In this paper, we propose a novel predictive safety filter that is robust to bounded perturbations and is combined with a learning-based control called differentiable predictive control (DPC). The proposed method provides rigorous guarantees of safety in the presence of bounded perturbations and implements DPC so long as the DPC control satisfies the system constraints. The approach also incorporates two forms of event-triggering to reduce online computation. The approach is comprised of a robust predictive safety filter that extends upon existing work to reject disturbances for discrete-time, time-varying nonlinear systems with time-varying constraints. The safety filter is based on novel concepts of robust, discrete-time barrier functions and can be used to filter any control law. Here we use the safety filter in conjunction with DPC as a promising policy optimization method. The approach is demonstrated on a single-integrator, two-tank system, and building example.

MOSAIC: A Multi-Objective Optimization Framework for Sustainable Datacenter Management

  • Authors: Sirui Qi, Dejan Milojicic, Cullen Bash, Sudeep Pasricha
  • Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Neural and Evolutionary Computing (cs.NE)
  • Arxiv link:
  • Pdf link:
  • Abstract In recent years, cloud service providers have been building and hosting datacenters across multiple geographical locations to provide robust services. However, the geographical distribution of datacenters introduces growing pressure to both local and global environments, particularly when it comes to water usage and carbon emissions. Unfortunately, efforts to reduce the environmental impact of such datacenters often lead to an increase in the cost of datacenter operations. To co-optimize the energy cost, carbon emissions, and water footprint of datacenter operation from a global perspective, we propose a novel framework for multi-objective sustainable datacenter management (MOSAIC) that integrates adaptive local search with a collaborative decomposition-based evolutionary algorithm to intelligently manage geographical workload distribution and datacenter operations. Our framework sustainably allocates workloads to datacenters while taking into account multiple geography- and time-based factors including renewable energy sources, variable energy costs, power usage efficiency, carbon factors, and water intensity in energy. Our experimental results show that, compared to the best-known prior work frameworks, MOSAIC can achieve 27.45x speedup and 1.53x improvement in Pareto Hypervolume while reducing the carbon footprint by up to 1.33x, water footprint by up to 3.09x, and energy costs by up to 1.40x. In the simultaneous three-objective co-optimization scenario, MOSAIC achieves a cumulative improvement across all objectives (carbon, water, cost) of up to 4.61x compared to the state of the art.

Multi-Radar Inertial Odometry for 3D State Estimation using mmWave Imaging Radar

  • Authors: Jui-Te Huang, Ruoyang Xu, Akshay Hinduja, Michael Kaess
  • Subjects: Robotics (cs.RO)
  • Arxiv link:
  • Pdf link:
  • Abstract State estimation is a crucial component for the successful implementation of robotic systems, relying on sensors such as cameras, LiDAR, and IMUs. However, in real-world scenarios, the performance of these sensors is degraded by challenging environments, e.g. adverse weather conditions and low-light scenarios. The emerging 4D imaging radar technology is capable of providing robust perception in adverse conditions. Despite its potential, challenges remain for indoor settings where noisy radar data does not present clear geometric features. Moreover, disparities in radar data resolution and field of view (FOV) can lead to inaccurate measurements. While prior research has explored radar-inertial odometry based on Doppler velocity information, challenges remain for the estimation of 3D motion because of the discrepancy in the FOV and resolution of the radar sensor. In this paper, we address Doppler velocity measurement uncertainties. We present a method to optimize body frame velocity while managing Doppler velocity uncertainty. Based on our observations, we propose a dual imaging radar configuration to mitigate the challenge of discrepancy in radar data. To attain high-precision 3D state estimation, we introduce a strategy that seamlessly integrates radar data with a consumer-grade IMU sensor using fixed-lag smoothing optimization. Finally, we evaluate our approach using real-world 3D motion data.

Coreset Selection with Prioritized Multiple Objectives

  • Authors: Xiaobo Xia, Jiale Liu, Shaokun Zhang, Qingyun Wu, Tongliang Liu
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link:
  • Pdf link:
  • Abstract Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. It strives to identify a small subset from large-scale data, so that training only on the subset practically performs on par with full data. When coreset selection is applied in realistic scenes, under the premise that the identified coreset has achieved comparable model performance, practitioners regularly want the identified coreset to be as small as possible for lower costs and greater acceleration. Motivated by this desideratum, for the first time, we pose the problem of "coreset selection with prioritized multiple objectives", in which the smallest coreset size under model performance constraints is explored. Moreover, to address this problem, an innovative method is proposed, which maintains optimization priority order over the model performance and coreset size, and efficiently optimizes them in the coreset selection procedure. Theoretically, we provide the convergence guarantee of the proposed method. Empirically, extensive experiments confirm its superiority compared with previous strategies, often yielding better model performance with smaller coreset sizes.

Federated Learning for Sparse Principal Component Analysis

  • Authors: Sin Cheng Ciou, Pin Jui Chen, Elvin Y. Tseng, Yuh-Jye Lee
  • Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Machine Learning (stat.ML)
  • Arxiv link:
  • Pdf link:
  • Abstract In the rapidly evolving realm of machine learning, algorithm effectiveness often faces limitations due to data quality and availability. Traditional approaches grapple with data sharing due to legal and privacy concerns. The federated learning framework addresses this challenge. Federated learning is a decentralized approach where model training occurs on client sides, preserving privacy by keeping data localized. Instead of sending raw data to a central server, only model updates are exchanged, enhancing data security. We apply this framework to Sparse Principal Component Analysis (SPCA) in this work. SPCA aims to attain sparse component loadings while maximizing data variance for improved interpretability. Besides the L1 norm regularization term in conventional SPCA, we add a smoothing function to facilitate gradient-based optimization methods. Moreover, in order to improve computational efficiency, we introduce a least squares approximation to the original SPCA. This enables analytic solutions on the optimization processes, leading to substantial computational improvements. Within the federated framework, we formulate SPCA as a consensus optimization problem, which can be solved using the Alternating Direction Method of Multipliers (ADMM). Our extensive experiments involve both IID and non-IID random features across various data owners. Results on synthetic and public datasets affirm the efficacy of our federated SPCA approach.

System-Wide Emergency Policy for Transitioning from Main to Secondary Fuel

  • Authors: Laurent Pagnier, Igal Goldshtein, Criston Hyett, Robert Ferrando, Jean Alisse, Lilah Saban, Michael Chertkov
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link:
  • Pdf link:
  • Abstract Inspired by the challenges of running Israel's power system -- with its increasing integration of renewables, significant load uncertainty, and primary reliance on natural gas -- we investigate an emergency scenario where there's a need to transition temporarily to a pricier secondary fuel until the emergency resolves. Our objective is to devise tools that can assist power system operators in making decisions during such critical periods. We frame this challenge as a Markov Decision Process (MDP) optimization, considering uncertainties like potential failures of dual-fuel generators during the transition, operator attentiveness under stress, available but finite amount of primary fuel (linepack available in the natural gas part of the system), power forecast (net demand after renewable production), and the cost implications of unavoidable load shedding. By solving the MDP in a simplified context, we identify viable policies through simulations of multiple parametrized Markov Processes (MPs). We verify our methodology using a realistic open-source model replicating Israel's power-gas infrastructure and outline next steps for refining and adapting this approach.

Joint User Pairing and Beamforming Design of Multi-STAR-RISs-Aided NOMA in the Indoor Environment via Multi-Agent Reinforcement Learning

  • Authors: Yu Min Park, Yan Kyaw Tun, Choong Seon Hong
  • Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
  • Arxiv link:
  • Pdf link:
  • Abstract The development of 6G/B5G wireless networks, which have requirements that go beyond current 5G networks, is gaining interest from academia and industry. However, to increase 6G/B5G network quality, conventional cellular networks that rely on terrestrial base stations are constrained geographically and economically. Meanwhile, NOMA allows multiple users to share the same resources, which improves the spectral efficiency of the system and has the advantage of supporting a larger number of users. Additionally, by intelligently manipulating the phase and amplitude of both the reflected and transmitted signals, STAR-RISs can achieve improved coverage, increased spectral efficiency, and enhanced communication reliability. However, STAR-RISs must simultaneously optimize the amplitude and phase shift corresponding to reflection and transmission, which makes the existing terrestrial networks more complicated and is considered a major challenge. Motivated by the above, we study the joint user pairing for NOMA and beamforming design of Multi-STAR-RISs in an indoor environment. Then, we formulate the optimization problem with the objective of maximizing the total throughput of MUs by jointly optimizing the decoding order, user pairing, active beamforming, and passive beamforming. However, the formulated problem is a MINLP. To tackle this challenge, we first introduce the decoding order for NOMA networks. Next, we decompose the original problem into two subproblems, namely 1) MU pairing and 2) beamforming optimization under the optimal decoding order. For the first subproblem, we employ correlation-based K-means clustering to solve the user pairing problem. Then, to jointly deal with beamforming vector optimizations, we propose MAPPO, which can make quick decisions in the given environment owing to its low complexity.

Optimal Placement of Capacitor in Distribution System using Particle Swarm Optimization

  • Authors: Izhar Ul Haq (School of Automation, Central South University, China)
  • Subjects: Systems and Control (eess.SY); Neural and Evolutionary Computing (cs.NE)
  • Arxiv link:
  • Pdf link:
  • Abstract In power systems, the incorporation of capacitors offers a wide range of established advantages. These benefits encompass the enhancement of the system's power factor, optimization of voltage profiles, increased capacity for current flow through cables and transformers, and the mitigation of losses attributed to the compensation of reactive power components. Different techniques have been applied to enhance the performance of the distribution system by reducing line losses. This paper focuses on reducing line losses through the optimal placement and sizing of capacitors. Optimal capacitor placement is analysed using load flow analysis with the Newton-Raphson method. The optimization of capacitor placement is related to the sensitivity of the buses, which depends on the loss sensitivity factor. The optimal capacitor size is determined using Particle Swarm Optimization (PSO). The analysis is conducted using the IEEE 14 bus system in MATLAB. The results reveal that placing capacitors at the most sensitive bus locations leads to a significant reduction in line losses. Additionally, the optimal capacitor size has a substantial impact on improving the voltage profile, and the power loss is reduced by 21.02 percent through the proposed method.

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

  • Authors: Naoki Sato, Hideaki Iiduka
  • Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract The graduated optimization approach is a heuristic method for finding globally optimal solutions for nonconvex functions and has been theoretically analyzed in several studies. This paper defines a new family of nonconvex functions for graduated optimization, discusses their sufficient conditions, and provides a convergence analysis of the graduated optimization algorithm for them. It shows that stochastic gradient descent (SGD) with mini-batch stochastic gradients has the effect of smoothing the function, the degree of which is determined by the learning rate and batch size. This finding provides theoretical insights from a graduated optimization perspective on why large batch sizes fall into sharp local minima, why decaying learning rates and increasing batch sizes are superior to fixed learning rates and batch sizes, and what the optimal learning rate scheduling is. To the best of our knowledge, this is the first paper to provide a theoretical explanation for these aspects. Moreover, a new graduated optimization framework that uses a decaying learning rate and increasing batch size is analyzed and experimental results of image classification that support our theoretical findings are reported.

Accelerating Toeplitz Neural Network with Constant-time Inference Complexity

  • Authors: Zhen Qin, Yiran Zhong
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link:
  • Pdf link:
  • Abstract Toeplitz Neural Networks (TNNs) have exhibited outstanding performance in various sequence modeling tasks. They outperform commonly used Transformer-based models while benefiting from log-linear space-time complexities. On the other hand, State Space Models (SSMs) achieve lower performance than TNNs in language modeling but offer the advantage of constant inference complexity. In this paper, we aim to combine the strengths of TNNs and SSMs by converting TNNs to SSMs during inference, thereby enabling TNNs to achieve the same constant inference complexities as SSMs. To accomplish this, we formulate the conversion process as an optimization problem and provide a closed-form solution. We demonstrate how to transform the target equation into a Vandermonde linear system problem, which can be efficiently solved using the Discrete Fourier Transform (DFT). Notably, our method requires no training and maintains numerical stability. It can also be applied to any LongConv-based model. To assess its effectiveness, we conduct extensive experiments on language modeling tasks across various settings. Additionally, we compare our method to other gradient-descent solutions, highlighting the superior numerical stability of our approach. The source code is available at
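To make the link between Vandermonde systems and the DFT concrete, here is a small standard-library Python sketch. It assumes the Vandermonde nodes are the n-th roots of unity, in which case the matrix is exactly the DFT matrix and the inverse DFT solves the system. The abstract does not state which nodes the paper uses, so this choice, and the helper names `vandermonde_matvec` and `solve_at_roots_of_unity`, are assumptions for illustration only.

```python
import cmath


def vandermonde_matvec(nodes, coeffs):
    """Compute y = V c, where V[k][j] = nodes[k] ** j."""
    return [sum(c * x**j for j, c in enumerate(coeffs)) for x in nodes]


def solve_at_roots_of_unity(y):
    """Solve V c = y when nodes[k] = exp(-2*pi*i*k/n), via the inverse DFT.

    Written as a plain O(n^2) sum for clarity; an FFT routine would
    compute the same result in O(n log n).
    """
    n = len(y)
    return [
        sum(y[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


n = 8
nodes = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
coeffs = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
y = vandermonde_matvec(nodes, coeffs)
recovered = solve_at_roots_of_unity(y)
# Largest recovery error; should be at the level of floating-point round-off.
print(max(abs(r - c) for r, c in zip(recovered, coeffs)))
```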

Frequency Domain-based Dataset Distillation

  • Authors: Donghyeok Shin, Seungjae Shin, Il-Chul Moon
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link:
  • Pdf link:
  • Abstract This paper presents FreD, a novel parameterization method for dataset distillation, which utilizes the frequency domain to distill a small-sized synthetic dataset from a large-sized original dataset. Unlike conventional approaches that focus on the spatial domain, FreD employs frequency-based transforms to optimize the frequency representations of each data instance. By leveraging the concentration of spatial domain information on specific frequency components, FreD intelligently selects a subset of frequency dimensions for optimization, leading to a significant reduction in the required budget for synthesizing an instance. Through the selection of frequency dimensions based on the explained variance, FreD demonstrates both theoretical and empirical evidence of its ability to operate efficiently within a limited budget, while better preserving the information of the original dataset compared to conventional parameterization methods. Furthermore, based on the orthogonal compatibility of FreD with existing methods, we confirm that FreD consistently improves the performances of existing distillation methods over the evaluation scenarios with different benchmark datasets. We release the code at

A* search algorithm for an optimal investment problem in vehicle-sharing systems

  • Authors: Ba Luat Le, Layla Martin, Emrah Demir, Duc Minh Vu
  • Subjects: Artificial Intelligence (cs.AI)
  • Arxiv link:
  • Pdf link:
  • Abstract We study an optimal investment problem that arises in the context of the vehicle-sharing system. Given a set of locations to build stations, we need to determine i) the sequence of stations to be built and the number of vehicles to acquire in order to obtain the target state where all stations are built, and ii) the number of vehicles to acquire and their allocation in order to maximize the total profit returned by operating the system when some or all stations are open. The profitability associated with operating open stations, measured over a specific time period, is represented as a linear optimization problem applied to a collection of open stations. With operating capital, the owner of the system can open new stations. This property introduces a set-dependent aspect to the duration required for opening a new station, and the optimal investment problem can be viewed as a variant of the Traveling Salesman Problem (TSP) with set-dependent cost. We propose an A* search algorithm to address this particular variant of the TSP. Computational experiments highlight the benefits of the proposed algorithm in comparison to the widely recognized Dijkstra algorithm and propose future research to explore new possibilities and applications for both exact and approximate A* algorithms.

An MRL-Based Design Solution for RIS-Assisted MU-MIMO Wireless System under Time-Varying Channels

  • Authors: Meng-Qian Alexander Wu, Tzu-Hsien Sang, Luisa Schuhmacher, Ming-Jie Guo, Khodr Hammoud, Sofie Pollin
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link:
  • Pdf link:
  • Abstract Utilizing Deep Reinforcement Learning (DRL) for Reconfigurable Intelligent Surface (RIS) assisted wireless communication has been extensively researched. However, existing DRL methods either act as a simple optimizer or only solve problems with concurrent Channel State Information (CSI) represented in the training data set. Consequently, solutions for RIS-assisted wireless communication systems under time-varying environments are relatively unexplored. However, communication problems should be considered with realistic assumptions; for instance, in scenarios where the channel is time-varying, the policy obtained by reinforcement learning should be applicable for situations where CSI is not well represented in the training data set. In this paper, we apply Meta-Reinforcement Learning (MRL) to the joint optimization problem of active beamforming at the Base Station (BS) and phase shift at the RIS, motivated by MRL's ability to extend the DRL concept of solving one Markov Decision Problem (MDP) to multiple MDPs. We provide simulation results to compare the average sum rate of the proposed approach with those of selected forerunners in the literature. Our approach improves the sum rate by more than 60% under time-varying CSI assumption while maintaining the advantages of typical DRL-based solutions. Our study's results emphasize the possibility of utilizing MRL-based designs in RIS-assisted wireless communication systems while considering realistic environment assumptions.

Formal Verification of Zero-Knowledge Circuits

  • Authors: Alessandro Coglio (Kestrel Institute and Aleo Systems Inc.), Eric McCarthy (Kestrel Institute and Aleo Systems Inc.), Eric W. Smith (Kestrel Institute)
  • Subjects: Logic in Computer Science (cs.LO); Cryptography and Security (cs.CR); Symbolic Computation (cs.SC)
  • Arxiv link:
  • Pdf link:
  • Abstract Zero-knowledge circuits are sets of equality constraints over arithmetic expressions interpreted in a prime field; they are used to encode computations in cryptographic zero-knowledge proofs. We make the following contributions to the problem of ensuring that a circuit correctly encodes a computation: a formal framework for circuit correctness; an ACL2 library for prime fields; an ACL2 model of the existing R1CS (Rank-1 Constraint Systems) formalism to represent circuits, along with ACL2 and Axe tools to verify circuits of this form; a novel PFCS (Prime Field Constraint Systems) formalism to represent hierarchically structured circuits, along with an ACL2 model of it and ACL2 tools to verify circuits of this form in a compositional and scalable way; verification of circuits, ranging from simple to complex; and discovery of bugs and optimizations in existing zero-knowledge systems.

ACL2 Proofs of Nonlinear Inequalities with Imandra

  • Authors: Grant Passmore (Imandra Inc.)
  • Subjects: Logic in Computer Science (cs.LO); Symbolic Computation (cs.SC)
  • Arxiv link:
  • Pdf link:
  • Abstract We present a proof-producing integration of ACL2 and Imandra for proving nonlinear inequalities. This leverages a new Imandra interface exposing its nonlinear decision procedures. The reasoning takes place over the reals, but the proofs produced are valid over the rationals and may be run in both ACL2 and ACL2(r). The ACL2 proofs Imandra constructs are extracted from Positivstellensatz refutations, a real algebraic analogue of the Nullstellensatz, and are found using convex optimization.

Verification of a Rust Implementation of Knuth's Dancing Links using ACL2

  • Authors: David S. Hardin
  • Subjects: Logic in Computer Science (cs.LO); Data Structures and Algorithms (cs.DS); Programming Languages (cs.PL)
  • Arxiv link:
  • Pdf link:
  • Abstract Dancing Links connotes an optimization to a circular doubly-linked list data structure implementation which provides for fast list element removal and restoration. The Dancing Links optimization is used primarily in fast algorithms to find exact covers, and has been popularized by Knuth in Volume 4B of his seminal series The Art of Computer Programming. We describe an implementation of the Dancing Links optimization in the Rust programming language, as well as its formal verification using the ACL2 theorem prover. Rust has garnered significant endorsement in the past few years as a modern, memory-safe successor to C/C++ at companies such as Amazon, Google, and Microsoft, and is being integrated into both the Linux and Windows operating system kernels. Our interest in Rust stems from its potential as a hardware/software co-assurance language, with application to critical systems. We have crafted a Rust subset, inspired by Russinoff's Restricted Algorithmic C (RAC), which we have imaginatively named Restricted Algorithmic Rust, or RAR. In previous work, we described our initial implementation of a RAR toolchain, wherein we simply transpile the RAR source into RAC. By so doing, we leverage a number of existing hardware/software co-assurance tools with a minimum investment of time and effort. In this paper, we describe the RAR Rust subset, describe our improved prototype RAR toolchain, and detail the design and verification of a circular doubly-linked list data structure employing the Dancing Links optimization in RAR, with full proofs of functional correctness accomplished using the ACL2 theorem prover.
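For readers unfamiliar with the technique, the removal and restoration that Dancing Links relies on can be sketched in a few lines. The sketch below is in Python rather than the paper's Restricted Algorithmic Rust, and every name in it (`Node`, `insert_after`, `remove`, `restore`, `values`) is hypothetical. It shows only the core pointer trick: a removed node keeps its own links, so it can later be spliced back in constant time.

```python
class Node:
    """Element of a circular doubly-linked list; starts linked to itself."""

    def __init__(self, value):
        self.value = value
        self.left = self
        self.right = self


def insert_after(anchor, node):
    """Splice `node` into the list immediately after `anchor`."""
    node.left = anchor
    node.right = anchor.right
    anchor.right.left = node
    anchor.right = node


def remove(node):
    """Unlink `node` from its neighbours, leaving node.left/right untouched."""
    node.left.right = node.right
    node.right.left = node.left


def restore(node):
    """Undo `remove` using the links the node still remembers."""
    node.left.right = node
    node.right.left = node


def values(head):
    """Collect values from head.right around to (but excluding) the head."""
    out = []
    cur = head.right
    while cur is not head:
        out.append(cur.value)
        cur = cur.right
    return out


head = Node(None)  # sentinel header node
nodes = [Node(v) for v in "abc"]
tail = head
for n in nodes:
    insert_after(tail, n)
    tail = n

remove(nodes[1])
print(values(head))  # ['a', 'c']
restore(nodes[1])
print(values(head))  # ['a', 'b', 'c']
```

Restorations need to happen in the reverse order of the removals, which is what a backtracking search does naturally.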

One-Shot Federated Learning with Classifier-Guided Diffusion Models

  • Authors: Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link:
  • Pdf link:
  • Abstract One-shot federated learning (OSFL) has gained attention in recent years due to its low communication cost. However, most of the existing methods require auxiliary datasets or training generators, which hinders their practicality in real-world scenarios. In this paper, we explore the novel opportunities that diffusion models bring to OSFL and propose FedCADO, utilizing guidance from client classifiers to generate data that complies with clients' distributions and subsequently training the aggregated model on the server. Specifically, our method involves targeted optimizations in two aspects. On one hand, we conditionally edit the randomly sampled initial noises, embedding them with specified semantics and distributions, resulting in a significant improvement in both the quality and stability of generation. On the other hand, we employ the BN statistics from the classifiers to provide detailed guidance during generation. These tailored optimizations enable us to limitlessly generate datasets, which closely resemble the distribution and quality of the original client dataset. Our method effectively handles the heterogeneous client models and the problems of non-IID features or labels. In terms of privacy protection, our method avoids training any generator or transferring any auxiliary information on clients, eliminating any additional privacy leakage risks. Leveraging the extensive knowledge stored in the pre-trained diffusion model, the synthetic datasets can assist us in surpassing the knowledge limitations of the client samples, resulting in aggregation models that even outperform the performance ceiling of centralized training in some cases, which is convincingly demonstrated in the sufficient quantification and visualization experiments conducted on three large-scale multi-domain image datasets.

Reducing 2-QuBit Gate Count for ZX-Calculus based Quantum Circuit Optimization

  • Authors: Korbinian Staudacher (Ludwig-Maximilians-Universität München), Tobias Guggemos (Ludwig-Maximilians-Universität München), Sophia Grundner-Culemann (Ludwig-Maximilians-Universität München), Wolfgang Gehrke (Universität der Bundeswehr München)
  • Subjects: Computational Engineering, Finance, and Science (cs.CE)
  • Arxiv link:
  • Pdf link:
  • Abstract In the near term, programming quantum computers will remain severely limited by low quantum volumes. Therefore, it is desirable to implement quantum circuits with the fewest resources possible. For the common Clifford+T circuits, most research is focused on reducing the number of T gates, since they are an order of magnitude more expensive than Clifford gates in quantum error corrected encoding schemes. However, this optimization sometimes leads to more 2-qubit gates, which, even though they are less expensive in terms of fault-tolerance, contribute significantly to the overall circuit cost. Approaches based on the ZX-calculus have recently gained some popularity in the field, but reduction of 2-qubit gates is not their focus. In this work, we present an alternative for improving the 2-qubit gate count of a quantum circuit with the ZX-calculus by using heuristics in ZX-diagram simplification. Our approach maintains the good reduction of the T gate count provided by other strategies based on ZX-calculus, thus serving as an extension for other optimization algorithms. Our results show that combining the available ZX-calculus-based optimizations with our algorithms can reduce the number of 2-qubit gates by as much as 40% compared to current approaches using ZX-calculus. Additionally, we improve the results of the best currently available optimization technique of Nam et al. for some circuits by up to 15%.

DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack

  • Authors: Perry Gibson, José Cano, Elliot J. Crowley, Amos Storkey, Michael O'Boyle
  • Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Performance (cs.PF)
  • Arxiv link:
  • Pdf link:
  • Abstract Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices. Since such devices are where many emerging deep learning applications lie (e.g., drones, vision-based medical technology), significant bodies of work from both the machine learning and systems communities have attempted to provide optimizations to accelerate DNNs. To help unify these two perspectives, in this paper we combine machine learning and systems techniques within the Deep Learning Acceleration Stack (DLAS), and demonstrate how these layers can be tightly dependent on each other with an across-stack perturbation study. We evaluate the impact on accuracy and inference time when varying different parameters of DLAS across two datasets, seven popular DNN architectures, four DNN compression techniques, three algorithmic primitives with sparse and dense variants, untuned and auto-scheduled code generation, and four hardware platforms. Our evaluation highlights how perturbations across DLAS parameters can cause significant variation and across-stack interactions. The highest-level observation from our evaluation is that the model size, accuracy, and inference time are not guaranteed to be correlated. Overall we make 13 key observations, including that speedups provided by compression techniques are very hardware dependent, and that compiler auto-tuning can significantly alter what the best algorithm to use for a given configuration is. With DLAS, we aim to provide a reference framework to aid machine learning and systems practitioners in reasoning about the context in which their respective DNN acceleration solutions exist. With our evaluation strongly motivating the need for co-design, we believe that DLAS can be a valuable concept for exploring the next generation of co-designed accelerated deep learning solutions.

Combining Shamir & Additive Secret Sharing to Improve Efficiency of SMC Primitives Against Malicious Adversaries

  • Authors: Kenneth Goss
  • Subjects: Cryptography and Security (cs.CR)
  • Arxiv link:
  • Pdf link:
  • Abstract Secure multi-party computation provides a wide array of protocols for mutually distrustful parties to securely evaluate functions of private inputs. In recent years, many such protocols have been proposed, representing a plethora of strategies to securely and efficiently handle such computation. These protocols have become increasingly efficient, but their performance is still impractical in many settings. We propose new approaches to some of these problems which are either more efficient than previous works within the same security models or offer better security guarantees with comparable efficiency. The goals of this research are to improve efficiency and security of secure multi-party protocols and explore the application of such approaches to novel threat scenarios. Some of the novel optimizations employed are dynamically switching domains of shared secrets, asymmetric computations, and advantageous functional transformations, among others. Specifically, this work presents a novel combination of Shamir and Additive secret sharing to be used in parallel which allows for the transformation of efficient protocols secure against passive adversaries to be secure against active adversaries. From this set of primitives we propose the construction of a comparison protocol which can be implemented under that approach with a complexity which is more efficient than other recent works for common domains of interest. Finally, we present a system which addresses a critical security threat for the protection and obfuscation of information which may be of high consequence.
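For readers unfamiliar with the two primitives named in this abstract, the following is a minimal sketch of textbook additive and Shamir secret sharing over a prime field. It is not the paper's protocol and does not show how the two schemes are combined or how active adversaries are handled; the field modulus `P` and all function names are assumptions chosen for illustration.

```python
import secrets

# Assumption: a Mersenne prime used as the field modulus for this sketch.
P = 2**61 - 1


def additive_share(secret, n):
    """Split `secret` into n random values that sum to it modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares


def additive_reconstruct(shares):
    """All n additive shares are required to recover the secret."""
    return sum(shares) % P


def shamir_share(secret, n, t):
    """Evaluate a random degree-t polynomial with f(0) = secret at x = 1..n."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo P
            acc = (acc * x + c) % P
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]


def shamir_reconstruct(points):
    """Lagrange interpolation at x = 0 from at least t + 1 distinct points."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # Modular inverse via Fermat's little theorem (P is prime).
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

In this sketch, additive sharing needs every share for reconstruction, while Shamir sharing with threshold `t` can reconstruct from any `t + 1` of the `n` shares.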

Supported Trust Region Optimization for Offline Reinforcement Learning

  • Authors: Yixiu Mao, Hongchang Zhang, Chen Chen, Yi Xu, Xiangyang Ji
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link:
  • Pdf link:
  • Abstract Offline reinforcement learning suffers from the out-of-distribution issue and extrapolation error. Most policy constraint methods regularize the density of the trained policy towards the behavior policy, which is too restrictive in most cases. We propose Supported Trust Region optimization (STR), which performs trust region policy optimization with the policy constrained within the support of the behavior policy, enjoying the less restrictive support constraint. We show that, when assuming no approximation and sampling error, STR guarantees strict policy improvement until convergence to the optimal support-constrained policy in the dataset. Furthermore, with both errors incorporated, STR still guarantees safe policy improvement for each step. Empirical results validate the theory of STR and demonstrate its state-of-the-art performance on MuJoCo locomotion domains and much more challenging AntMaze domains.

A Spectral Diffusion Prior for Hyperspectral Image Super-Resolution

  • Authors: Jianjun Liu, Zebin Wu, Liang Xiao
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
  • Arxiv link:
  • Pdf link:
  • Abstract Fusion-based hyperspectral image (HSI) super-resolution aims to produce a high-spatial-resolution HSI by fusing a low-spatial-resolution HSI and a high-spatial-resolution multispectral image. Such an HSI super-resolution process can be modeled as an inverse problem, where prior knowledge is essential for obtaining the desired solution. Motivated by the success of diffusion models, we propose a novel spectral diffusion prior for fusion-based HSI super-resolution. Specifically, we first investigate the spectrum generation problem and design a spectral diffusion model to model the spectral data distribution. Then, in the framework of maximum a posteriori, we keep the transition information between every two neighboring states during the reverse generative process, and thereby embed the knowledge of the trained spectral diffusion model into the fusion problem in the form of a regularization term. Finally, we treat each generation step of the final optimization problem as its subproblem, and employ Adam to solve these subproblems in a reverse sequence. Experimental results conducted on both synthetic and real datasets demonstrate the effectiveness of the proposed approach. The code of the proposed approach will be available on

Unsupervised approaches based on optimal transport and convex analysis for inverse problems in imaging

  • Authors: Marcello Carioni, Subhadip Mukherjee, Hong Ye Tan, Junqi Tang
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract Unsupervised deep learning approaches have recently become one of the crucial research areas in imaging owing to their ability to learn expressive and powerful reconstruction operators even when paired high-quality training data is scarcely available. In this chapter, we review theoretically principled unsupervised learning schemes for solving imaging inverse problems, with a particular focus on methods rooted in optimal transport and convex analysis. We begin by reviewing the optimal transport-based unsupervised approaches such as the cycle-consistency-based models and learned adversarial regularization methods, which have clear probabilistic interpretations. Subsequently, we give an overview of a recent line of works on provably convergent learned optimization algorithms applied to accelerate the solution of imaging inverse problems, alongside their dedicated unsupervised training schemes. We also survey a number of provably convergent plug-and-play algorithms (based on gradient-step deep denoisers), which are among the most important and widely applied unsupervised approaches for imaging problems. At the end of this survey, we provide an overview of a few related unsupervised learning frameworks that complement our focused schemes. Together with a detailed survey, we provide an overview of the key mathematical results that underlie the methods reviewed in the chapter to keep our discussion self-contained.

Edge Accelerated Robot Navigation with Hierarchical Motion Planning

  • Authors: Guoliang Li, Ruihua Han, Shuai Wang, Fei Gao, Yonina C. Eldar, Chengzhong Xu
  • Subjects: Robotics (cs.RO)
  • Arxiv link:
  • Pdf link:
  • Abstract Low-cost autonomous robots suffer from limited onboard computing power, resulting in excessive computation time when navigating in cluttered environments. This paper presents Edge Accelerated Robot Navigation, or EARN for short, to achieve real-time collision avoidance by adopting hierarchical motion planning (HMP). In contrast to existing local or edge motion planning solutions that ignore the interdependency between low-level motion planning and high-level resource allocation, EARN adopts model predictive switching (MPS) that maximizes the expected switching gain w.r.t. robot states and actions under computation and communication resource constraints. As such, each robot can dynamically switch between a point-mass motion planner executed locally to guarantee safety (e.g., path-following) and a full-shape motion planner executed non-locally to guarantee efficiency (e.g., overtaking). The crux of EARN is a two-timescale integrated decision-planning algorithm based on bilevel mixed-integer optimization, and a fast conditional collision avoidance algorithm based on penalty dual decomposition. We validate the performance of EARN in indoor simulation, outdoor simulation, and real-world environments. Experiments show that EARN achieves significantly shorter navigation times and lower collision ratios than state-of-the-art navigation approaches.

Semidefinite programs simulate approximate message passing robustly

  • Authors: Misha Ivkov, Tselil Schramm
  • Subjects: Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
  • Arxiv link:
  • Pdf link:
  • Abstract Approximate message passing (AMP) is a family of iterative algorithms that generalize matrix power iteration. AMP algorithms are known to optimally solve many average-case optimization problems. In this paper, we show that a large class of AMP algorithms can be simulated in polynomial time by \emph{local statistics hierarchy} semidefinite programs (SDPs), even when an unknown principal minor of measure $1/\mathrm{polylog}(\mathrm{dimension})$ is adversarially corrupted. Ours are the first robust guarantees for many of these problems. Further, our results offer an interesting counterpoint to strong lower bounds against less constrained SDP relaxations for average-case max-cut-gain (a.k.a. "optimizing the Sherrington-Kirkpatrick Hamiltonian") and other problems.

Integrating Sensing, Communication, and Power Transfer: Multiuser Beamforming Design

  • Authors: Ziqin Zhou, Xiaoyang Li, Guangxu Zhu, Jie Xu, Kaibin Huang, Shuguang Cui
  • Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
  • Arxiv link:
  • Pdf link:
  • Abstract In sixth-generation (6G) networks, massive numbers of low-power devices are expected to sense the environment and deliver tremendous amounts of data. To enhance the radio resource efficiency, the integrated sensing and communication (ISAC) technique exploits the sensing and communication functionalities of signals, while the simultaneous wireless information and power transfer (SWIPT) technique utilizes the same signals as the carriers for both information and power delivery. The further combination of ISAC and SWIPT leads to an advanced technology, namely integrated sensing, communication, and power transfer (ISCPT). In this paper, a multi-user multiple-input multiple-output (MIMO) ISCPT system is considered, where a base station equipped with multiple antennas transmits messages to multiple information receivers (IRs), transfers power to multiple energy receivers (ERs), and senses a target simultaneously. The sensing target can be regarded as a point or an extended surface. When the locations of IRs and ERs are separated, the MIMO beamforming designs are optimized to improve the sensing performance while meeting the communication and power transfer requirements. The resultant non-convex optimization problems are solved based on a series of techniques including Schur complement transformation and rank reduction. Moreover, when the IRs and ERs are co-located, the power splitting factors are jointly optimized together with the beamformers to balance the performance of communication and power transfer. To better understand the performance of ISCPT, the target positioning problem is further investigated. Simulations are conducted to verify the effectiveness of our proposed designs, which also reveal a performance tradeoff among sensing, communication, and power transfer.

Network-Level Integrated Sensing and Communication: Interference Management and BS Coordination Using Stochastic Geometry

  • Authors: Kaitao Meng, Christos Masouros, Guangji Chen, Fan Liu
  • Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
  • Arxiv link:
  • Pdf link:
  • Abstract In this work, we study integrated sensing and communication (ISAC) networks with the aim of effectively balancing sensing and communication (S&C) performance at the network level. Focusing on monostatic sensing, the tool of stochastic geometry is exploited to capture the S&C performance, which allows us to illuminate key cooperative dependencies in the ISAC network and optimize key network-level parameters. Based on the derived tractable expression of area spectral efficiency (ASE), we formulate the optimization problem to maximize the network performance from the viewpoint of two joint S&C metrics. Towards this end, we further jointly optimize the cooperative BS cluster sizes for S&C and the serving/probing numbers of users/targets to achieve a flexible tradeoff between S&C at the network level. It is verified that interference nulling can effectively improve the average data rate and radar information rate. Surprisingly, the optimal communication tradeoff for the case of the ASE maximization tends to employ all spatial resources towards multiplexing and diversity gain, without interference nulling. By contrast, for the sensing objectives, resource allocation tends to eliminate certain interference especially when the antenna resources are sufficient, because the inter-cell interference becomes a more dominant factor affecting sensing performance. Furthermore, we prove that the ratio of the optimal number of users and the number of transmit antennas is a constant value when the communication performance is optimal. Simulation results demonstrate that the proposed cooperative ISAC scheme achieves a substantial gain in S&C performance at the network level.

New Horizons in Parameter Regularization: A Constraint Approach

  • Authors: Jörg K.H. Franke, Michael Hefenbrock, Gregor Koehler, Frank Hutter
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link:
  • Pdf link:
  • Abstract This work presents constrained parameter regularization (CPR), an alternative to traditional weight decay. Instead of applying a constant penalty uniformly to all parameters, we enforce an upper bound on a statistical measure (e.g., the L$_2$-norm) of individual parameter groups. This reformulates learning as a constrained optimization problem. To solve this, we utilize an adaptation of the augmented Lagrangian method. Our approach allows for varying regularization strengths across different parameter groups, removing the need for explicit penalty coefficients in the regularization terms. CPR only requires two hyperparameters and introduces no measurable runtime overhead. We offer empirical evidence of CPR's effectiveness through experiments in the "grokking" phenomenon, image classification, and language modeling. Our findings show that CPR can counteract the effects of grokking, and it consistently matches or surpasses the performance of traditional weight decay.
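The constrained formulation in this abstract can be illustrated with a toy primal-dual loop: one parameter group, a bound `kappa` on its squared L2 norm, and a multiplier that grows while the bound is violated. This is only a sketch in the spirit of a Lagrangian multiplier update, not the authors' CPR implementation; the helper name `constrained_step`, the quadratic toy loss, and the values of `lr` and `mu` are all assumptions.

```python
import numpy as np


def constrained_step(theta, grad_loss, lam, kappa, lr=0.1, mu=1.0):
    """One primal-dual update for a single parameter group (hypothetical helper).

    The regularization measure is R(theta) = ||theta||^2 with bound kappa.
    """
    # Primal step: loss gradient plus the multiplier-weighted gradient of R.
    theta = theta - lr * (grad_loss + lam * 2.0 * theta)
    # Dual step: raise lam while R(theta) > kappa, never below zero.
    lam = max(0.0, lam + mu * (float(theta @ theta) - kappa))
    return theta, lam


def fit_toy(target, kappa, steps=500):
    """Minimize 0.5 * ||theta - target||^2 subject to ||theta||^2 <= kappa."""
    theta = np.zeros_like(target, dtype=float)
    lam = 0.0
    for _ in range(steps):
        grad_loss = theta - target  # gradient of the toy quadratic loss
        theta, lam = constrained_step(theta, grad_loss, lam, kappa)
    return theta, lam
```

In this toy setup the multiplier plays the role that a hand-tuned weight decay coefficient would otherwise play: it is adjusted automatically until the group's norm sits at or below the bound.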

Automatic cable harness layout routing in a customizable 3D environment

  • Authors: T. Karlsson, E. Åblad, T. Hermansson, J. S. Carlson, G. Tenfält
  • Subjects: Computational Engineering, Finance, and Science (cs.CE)
  • Arxiv link:
  • Pdf link:
  • Abstract Designing cable harnesses can be time-consuming and complex due to many design and manufacturing aspects and rules. Automating the design process can help to fulfil these rules, speed up the process, and optimize the design. To accommodate this, we formulate a harness routing optimization problem to minimize cable lengths, maximize bundling by rewarding shared paths, and optimize the cables' spatial location with respect to case-specific information of the routing environment, e.g., zones to avoid. A deterministic and computationally effective cable harness routing algorithm has been developed to solve the routing problem and is used to generate a set of cable harness topology candidates and approximate the Pareto front. Our approach was tested against a stochastic and an exact solver, and our routing algorithm generated objective function values better than the stochastic approach and close to the exact solver. Our algorithm was able to find solutions, some of them proven to be near-optimal, for three industrial-sized 3D cases within reasonable time (on the order of seconds to minutes), and the computation times were comparable to those of the stochastic approach.

Scalable and Effective Generative Information Retrieval

  • Authors: Hansi Zeng, Chen Luo, Bowen Jin, Sheikh Muhammad Sarwar, Tianxin Wei, Hamed Zamani
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link:
  • Pdf link:
  • Abstract Recent research has shown that transformer networks can be used as differentiable search indexes by representing each document as a sequence of document ID tokens. These generative retrieval models cast the retrieval problem as a document ID generation problem for each given query. Despite their elegant design, existing generative retrieval models only perform well on artificially-constructed and small-scale collections. This has led to serious skepticism in the research community on their real-world impact. This paper represents an important milestone in generative retrieval research by showing, for the first time, that generative retrieval models can be trained to perform effectively on large-scale standard retrieval benchmarks. To do so, we propose RIPOR, an optimization framework for generative retrieval that can be adopted by any encoder-decoder architecture. RIPOR is designed based on two often-overlooked fundamental design considerations in generative retrieval. First, given the sequential decoding nature of document ID generation, assigning accurate relevance scores to documents based on the whole document ID sequence is not sufficient. To address this issue, RIPOR introduces a novel prefix-oriented ranking optimization algorithm. Second, initial document IDs should be constructed based on relevance associations between queries and documents, instead of the syntactic and semantic information in the documents. RIPOR addresses this issue using a relevance-based document ID construction approach that quantizes relevance-based representations learned for documents. Evaluation on MS MARCO and TREC Deep Learning Track reveals that RIPOR surpasses state-of-the-art generative retrieval models by a large margin (e.g., 30.5% MRR improvements on MS MARCO Dev Set), and performs on par with popular dense retrieval models.

Unified incremental nonlinear controller for the transition control of a hybrid dual-axis tilting rotor quad-plane

  • Authors: Alessandro Mancinelli, Bart D.W. Remes, Guido C.H.E. de Croon, Ewoud J.J. Smeur
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link:
  • Pdf link:
  • Abstract Overactuated Tilt Rotor Unmanned Aerial Vehicles are renowned for exceptional wind resistance and a broad operational range, but they pose complex control challenges due to non-affine dynamics. Traditional solutions employ multi-state switched logic controllers for transitions. Our study introduces a novel unified incremental nonlinear controller for overactuated dual-axis tilting rotor quad-planes, seamlessly managing pitch, roll, and physical actuator commands. The control allocation problem is addressed using an SQP iterative optimization algorithm, well-suited for nonlinear actuator effectiveness in thrust vectoring vehicles. The controller design integrates desired roll and pitch angle inputs. These desired attitude angles are autonomously managed by the controller and then conveyed to the vehicle during slow airspeed phases, when the vehicle maintains its 6 DOF. We incorporate an AoA protection logic to prevent wing stall and a yaw rate reference model for coordinated turns. Flight tests confirm the controller's effectiveness in transitioning from hovering to forward flight, achieving desired vertical and lateral accelerations, and reverting to hovering.

PsyEval: A Comprehensive Large Language Model Evaluation Benchmark for Mental Health

  • Authors: Haoan Jin, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link:
  • Pdf link:
  • Abstract Recently, there has been a growing interest in utilizing large language models (LLMs) in mental health research, with studies showcasing their remarkable capabilities, such as disease detection. However, there is currently no comprehensive benchmark for evaluating the capability of LLMs in this domain. Therefore, we address this gap by introducing the first comprehensive benchmark tailored to the unique characteristics of the mental health domain. This benchmark encompasses a total of six sub-tasks, covering three dimensions, to systematically assess the capabilities of LLMs in the realm of mental health. We have designed corresponding concise prompts for each sub-task, and we comprehensively evaluate a total of eight advanced LLMs using our benchmark. Experimental results not only demonstrate significant room for improvement in current LLMs concerning mental health but also unveil potential directions for future model optimization.

On the Computation of the Gaussian Rate-Distortion-Perception Function

  • Authors: Giuseppe Serra, Photios A. Stavrou, Marios Kountouris
  • Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)
  • Arxiv link:
  • Pdf link:
  • Abstract In this paper, we study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source under mean squared error (MSE) distortion and, respectively, Kullback-Leibler divergence, geometric Jensen-Shannon divergence, squared Hellinger distance, and squared Wasserstein-2 distance perception metrics. To this end, we first characterize the analytical bounds of the scalar Gaussian RDPF for the aforementioned divergence functions, also providing the RDPF-achieving forward "test-channel" realization. Focusing on the multivariate case, we establish that, for tensorizable distortion and perception metrics, the optimal solution resides on the vector space spanned by the eigenvectors of the source covariance matrix. Consequently, the multivariate optimization problem can be expressed as a function of the scalar Gaussian RDPFs of the source marginals, constrained by global distortion and perception levels. Leveraging this characterization, we design an alternating minimization scheme based on the block nonlinear Gauss-Seidel method, which optimally solves the problem while identifying the Gaussian RDPF-achieving realization. Furthermore, the associated algorithmic embodiment is provided, as well as the convergence and the rate of convergence characterization. Lastly, for the "perfect realism" regime, the analytical solution for the multivariate Gaussian RDPF is obtained. We corroborate our results with numerical simulations and draw connections to existing results.

Keyword: adam

A Spectral Diffusion Prior for Hyperspectral Image Super-Resolution

  • Authors: Jianjun Liu, Zebin Wu, Liang Xiao
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
  • Arxiv link:
  • Pdf link:
  • Abstract Fusion-based hyperspectral image (HSI) super-resolution aims to produce a high-spatial-resolution HSI by fusing a low-spatial-resolution HSI and a high-spatial-resolution multispectral image. Such an HSI super-resolution process can be modeled as an inverse problem, where prior knowledge is essential for obtaining the desired solution. Motivated by the success of diffusion models, we propose a novel spectral diffusion prior for fusion-based HSI super-resolution. Specifically, we first investigate the spectrum generation problem and design a spectral diffusion model to model the spectral data distribution. Then, in the framework of maximum a posteriori, we keep the transition information between every two neighboring states during the reverse generative process, and thereby embed the knowledge of the trained spectral diffusion model into the fusion problem in the form of a regularization term. Finally, we treat each generation step of the final optimization problem as its subproblem, and employ Adam to solve these subproblems in a reverse sequence. Experimental results conducted on both synthetic and real datasets demonstrate the effectiveness of the proposed approach. The code of the proposed approach will be available on

Keyword: gradient

Uncertainty Quantification in Neural-Network Based Pain Intensity Estimation

  • Authors: Burcu Ozek, Zhenyuan Lu, Srinivasan Radhakrishnan, Sagar Kamarthi
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link:
  • Pdf link:
  • Abstract Improper pain management can lead to severe physical or mental consequences, including suffering and an increased risk of opioid dependency. Assessing the presence and severity of pain is imperative to prevent such outcomes and determine the appropriate intervention. However, the evaluation of pain intensity is challenging because different individuals experience pain differently. To overcome this, researchers have employed machine learning models to evaluate pain intensity objectively. However, these efforts have primarily focused on point estimation of pain, disregarding the inherent uncertainty and variability present in the data and model. Consequently, the point estimates provide only partial information for clinical decision-making. This study presents a neural network-based method for objective pain interval estimation, incorporating uncertainty quantification. This work explores three algorithms: the bootstrap method, lower and upper bound estimation (LossL) optimized by a genetic algorithm, and modified lower and upper bound estimation (LossS) optimized by a gradient descent algorithm. Our empirical results reveal that LossS outperforms the other two by providing a narrower prediction interval. Because LossS performs best, we assessed its performance in three different scenarios for pain assessment: (1) a generalized approach (single model for the entire population), (2) a personalized approach (separate model for each individual), and (3) a hybrid approach (separate model for each cluster of individuals). Our findings demonstrate the hybrid approach's superior performance, with notable practicality in clinical contexts. It has the potential to be a valuable tool for clinicians, enabling objective pain intensity assessment while taking uncertainty into account. This capability is crucial in facilitating effective pain management and reducing the risks associated with improper treatment.

Review of AlexNet for Medical Image Classification

  • Authors: Wenhao Tang, Junding Sun, Shuihua Wang, Yudong Zhang
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link:
  • Pdf link:
  • Abstract In recent years, the rapid development of deep learning has led to a wide range of applications in the field of medical image classification. The variants of neural network models with ever-increasing performance share some common goals: mitigating overfitting, improving generalization, and avoiding vanishing and exploding gradients. AlexNet first utilizes the dropout technique to mitigate overfitting and the ReLU activation function to avoid gradient vanishing. Therefore, we focus our discussion on AlexNet, which has contributed greatly to the development of CNNs since its introduction in 2012. After reviewing over 40 papers, including journal papers and conference papers, we give a narrative on the technical details, advantages, and application areas of AlexNet.

Federated Learning for Sparse Principal Component Analysis

  • Authors: Sin Cheng Ciou, Pin Jui Chen, Elvin Y. Tseng, Yuh-Jye Lee
  • Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Machine Learning (stat.ML)
  • Arxiv link:
  • Pdf link:
  • Abstract In the rapidly evolving realm of machine learning, algorithm effectiveness often faces limitations due to data quality and availability. Traditional approaches grapple with data sharing due to legal and privacy concerns. The federated learning framework addresses this challenge. Federated learning is a decentralized approach where model training occurs on the client side, preserving privacy by keeping data localized. Instead of sending raw data to a central server, only model updates are exchanged, enhancing data security. We apply this framework to Sparse Principal Component Analysis (SPCA) in this work. SPCA aims to attain sparse component loadings while maximizing data variance for improved interpretability. Besides the L1 norm regularization term in conventional SPCA, we add a smoothing function to facilitate gradient-based optimization methods. Moreover, in order to improve computational efficiency, we introduce a least squares approximation to the original SPCA. This enables analytic solutions in the optimization process, leading to substantial computational improvements. Within the federated framework, we formulate SPCA as a consensus optimization problem, which can be solved using the Alternating Direction Method of Multipliers (ADMM). Our extensive experiments involve both IID and non-IID random features across various data owners. Results on synthetic and public datasets affirm the efficacy of our federated SPCA approach.

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

  • Authors: Naoki Sato, Hideaki Iiduka
  • Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract The graduated optimization approach is a heuristic method for finding globally optimal solutions for nonconvex functions and has been theoretically analyzed in several studies. This paper defines a new family of nonconvex functions for graduated optimization, discusses their sufficient conditions, and provides a convergence analysis of the graduated optimization algorithm for them. It shows that stochastic gradient descent (SGD) with mini-batch stochastic gradients has the effect of smoothing the function, the degree of which is determined by the learning rate and batch size. This finding provides theoretical insights from a graduated optimization perspective on why large batch sizes fall into sharp local minima, why decaying learning rates and increasing batch sizes are superior to fixed learning rates and batch sizes, and what the optimal learning rate scheduling is. To the best of our knowledge, this is the first paper to provide a theoretical explanation for these aspects. Moreover, a new graduated optimization framework that uses a decaying learning rate and increasing batch size is analyzed and experimental results of image classification that support our theoretical findings are reported.
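The combination of a decaying learning rate and an increasing batch size mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' framework: the function names `schedule` and `sgd`, the step-wise schedule, and all constants (`lr0`, `batch0`, `lr_decay`, `batch_growth`, `every`) are assumptions chosen for the example, and the toy objective is a simple quadratic rather than a nonconvex function.

```python
import numpy as np

def schedule(epoch, lr0=0.1, batch0=16, lr_decay=0.5, batch_growth=2, every=10):
    """Hypothetical step schedule: every `every` epochs, multiply the
    learning rate by `lr_decay` and the batch size by `batch_growth`."""
    stage = epoch // every
    return lr0 * lr_decay ** stage, batch0 * batch_growth ** stage

def sgd(grad_fn, x0, data, epochs=30, seed=0):
    """Mini-batch SGD on a scalar parameter using the schedule above."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for epoch in range(epochs):
        lr, batch = schedule(epoch)
        batch = min(batch, len(data))
        perm = rng.permutation(len(data))  # reshuffle the data each epoch
        for start in range(0, len(data), batch):
            idx = perm[start:start + batch]
            # Average the per-sample gradients over the mini-batch.
            x -= lr * np.mean(grad_fn(x, data[idx]))
    return x

# Toy per-sample loss 0.5 * (x - d)^2, whose gradient is x - d (assumption).
data = np.linspace(1.0, 3.0, 256)
x_final = sgd(lambda x, d: x - d, x0=10.0, data=data)
```

Early epochs take noisy steps (large learning rate, small batches), and later epochs take quieter ones; in this toy problem `x_final` ends up close to the data mean.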

Accelerating Toeplitz Neural Network with Constant-time Inference Complexity

  • Authors: Zhen Qin, Yiran Zhong
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link:
  • Pdf link:
  • Abstract Toeplitz Neural Networks (TNNs) have exhibited outstanding performance in various sequence modeling tasks. They outperform commonly used Transformer-based models while benefiting from log-linear space-time complexities. On the other hand, State Space Models (SSMs) achieve lower performance than TNNs in language modeling but offer the advantage of constant inference complexity. In this paper, we aim to combine the strengths of TNNs and SSMs by converting TNNs to SSMs during inference, thereby enabling TNNs to achieve the same constant inference complexities as SSMs. To accomplish this, we formulate the conversion process as an optimization problem and provide a closed-form solution. We demonstrate how to transform the target equation into a Vandermonde linear system problem, which can be efficiently solved using the Discrete Fourier Transform (DFT). Notably, our method requires no training and maintains numerical stability. It can also be applied to any LongConv-based model. To assess its effectiveness, we conduct extensive experiments on language modeling tasks across various settings. Additionally, we compare our method to other gradient-descent solutions, highlighting the superior numerical stability of our approach. The source code is available at
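To illustrate why a Vandermonde linear system can be solved with a DFT, here is a minimal sketch under one stated assumption: the Vandermonde nodes are the n-th roots of unity, in which case the matrix coincides with the DFT matrix. This is not the authors' conversion code; `solve_vandermonde_dft` and `vandermonde_roots_of_unity` are hypothetical names introduced for the example.

```python
import numpy as np

def vandermonde_roots_of_unity(n):
    """Build V with V[k, m] = nodes[k] ** m, where the nodes are the
    n-th roots of unity exp(-2j*pi*k/n) (assumption for this sketch)."""
    k = np.arange(n)
    nodes = np.exp(-2j * np.pi * k / n)
    return np.vander(nodes, N=n, increasing=True)

def solve_vandermonde_dft(b):
    """Solve V a = b for the matrix above without forming V.

    With these nodes, V @ a equals np.fft.fft(a), so the solution
    is the inverse FFT of b, computed in O(n log n)."""
    return np.fft.ifft(np.asarray(b, dtype=complex))

# Usage: recover known coefficients from the right-hand side.
a_true = np.array([1.0, -2.0, 0.5, 3.0])
b = vandermonde_roots_of_unity(4) @ a_true
a_hat = solve_vandermonde_dft(b)
```

Because the DFT matrix is unitary up to scaling, this route avoids the ill-conditioning that general Vandermonde systems can exhibit.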

X-GRL: An Empirical Assessment of Explainable GNN-DRL in B5G/6G Networks

  • Authors: Farhad Rezazadeh, Sergio Barrachina-Muñoz, Engin Zeydan, Houbing Song, K.P. Subbalakshmi, Josep Mangues-Bafalluy
  • Subjects: Networking and Internet Architecture (cs.NI)
  • Arxiv link:
  • Pdf link:
  • Abstract The rapid development of artificial intelligence (AI) techniques has triggered a revolution in beyond fifth-generation (B5G) and upcoming sixth-generation (6G) mobile networks. Despite these advances, efficient resource allocation in dynamic and complex networks remains a major challenge. This paper presents an experimental implementation of deep reinforcement learning (DRL) enhanced with graph neural networks (GNNs) on a real 5G testbed. The method addresses the explainability of GNNs by evaluating the importance of each edge in determining the model's output. The custom sampling functions feed the data into the proposed GNN-driven Monte Carlo policy gradient (REINFORCE) agent to optimize the gNodeB (gNB) radio resources according to the specific traffic demands. The demo shows real-time visualization of network parameters and superior performance compared to benchmarks.

Efficiently Escaping Saddle Points for Non-Convex Policy Optimization

  • Authors: Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Niao He, Matthias Grossglauser
  • Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance. In recent years, several variance-reduced PG methods have been proposed with a theoretical guarantee of converging to an approximate first-order stationary point (FOSP) with the sample complexity of $O(\epsilon^{-3})$. However, FOSPs could be bad local optima or saddle points. Moreover, these algorithms often use importance sampling (IS) weights which could impair the statistical effectiveness of variance reduction. In this paper, we propose a variance-reduced second-order method that uses second-order information in the form of Hessian vector products (HVP) and converges to an approximate second-order stationary point (SOSP) with sample complexity of $\tilde{O}(\epsilon^{-3})$. This rate improves the best-known sample complexity for achieving approximate SOSPs by a factor of $O(\epsilon^{-0.5})$. Moreover, the proposed variance reduction technique bypasses IS weights by using HVP terms. Our experimental results show that the proposed algorithm outperforms the state of the art and is more robust to changes in random seeds.

Unsupervised approaches based on optimal transport and convex analysis for inverse problems in imaging

  • Authors: Marcello Carioni, Subhadip Mukherjee, Hong Ye Tan, Junqi Tang
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC)
  • Arxiv link:
  • Pdf link:
  • Abstract Unsupervised deep learning approaches have recently become one of the crucial research areas in imaging owing to their ability to learn expressive and powerful reconstruction operators even when paired high-quality training data is scarcely available. In this chapter, we review theoretically principled unsupervised learning schemes for solving imaging inverse problems, with a particular focus on methods rooted in optimal transport and convex analysis. We begin by reviewing the optimal transport-based unsupervised approaches such as the cycle-consistency-based models and learned adversarial regularization methods, which have clear probabilistic interpretations. Subsequently, we give an overview of a recent line of works on provably convergent learned optimization algorithms applied to accelerate the solution of imaging inverse problems, alongside their dedicated unsupervised training schemes. We also survey a number of provably convergent plug-and-play algorithms (based on gradient-step deep denoisers), which are among the most important and widely applied unsupervised approaches for imaging problems. At the end of this survey, we provide an overview of a few related unsupervised learning frameworks that complement our focused schemes. Together with a detailed survey, we provide an overview of the key mathematical results that underlie the methods reviewed in the chapter to keep our discussion self-contained.

A high-order local discontinuous Galerkin method for the $p$-Laplace equation

  • Authors: Yue Wu, Yan Xu
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link:
  • Pdf link:
  • Abstract We study the high-order local discontinuous Galerkin (LDG) method for the $p$-Laplace equation. We reformulate our spatial discretization as an equivalent convex minimization problem and use a preconditioned gradient descent method as the nonlinear solver. For the first time, a weighted preconditioner that provides $hk$-independent convergence is applied in the LDG setting. For polynomial order $k \geqslant 1$, we rigorously establish the solvability of our scheme and provide a priori error estimates in a mesh-dependent energy norm. Our error estimates are under a different and non-equivalent distance from existing LDG results. For arbitrarily high-order polynomials under the assumption that the exact solution has enough regularity, the error estimates demonstrate the potential for high-order accuracy. Our numerical results exhibit the desired convergence speed facilitated by the preconditioner, and we observe best convergence rates in gradient variables in alignment with linear LDG, and optimal rates in the primal variable when $1 < p \leqslant 2$.

Keyword: super-resolution

A Spectral Diffusion Prior for Hyperspectral Image Super-Resolution

  • Authors: Jianjun Liu, Zebin Wu, Liang Xiao
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
  • Arxiv link:
  • Pdf link:
  • Abstract Fusion-based hyperspectral image (HSI) super-resolution aims to produce a high-spatial-resolution HSI by fusing a low-spatial-resolution HSI and a high-spatial-resolution multispectral image. Such an HSI super-resolution process can be modeled as an inverse problem, where the prior knowledge is essential for obtaining the desired solution. Motivated by the success of diffusion models, we propose a novel spectral diffusion prior for fusion-based HSI super-resolution. Specifically, we first investigate the spectrum generation problem and design a spectral diffusion model to model the spectral data distribution. Then, in the framework of maximum a posteriori, we keep the transition information between every two neighboring states during the reverse generative process, and thereby embed the knowledge of the trained spectral diffusion model into the fusion problem in the form of a regularization term. Finally, we treat each generation step of the final optimization problem as its subproblem, and employ Adam to solve these subproblems in a reverse sequence. Experimental results conducted on both synthetic and real datasets demonstrate the effectiveness of the proposed approach. The code of the proposed approach will be available on

zoq, Nov 16 '23 07:11