New submissions for Mon, 16 Oct 23

Open zoq opened this issue 1 year ago • 0 comments

Keyword: sgd

Spatially Continuous Non-Contact Cold Sensation Presentation Based on Low-Temperature Airflows

  • Authors: Koyo Makino, Jiayi Xu, Akiko Kaneko, Naoto Ienaga, Yoshihiro Kuroda
  • Subjects: Human-Computer Interaction (cs.HC)
  • Arxiv link: https://arxiv.org/abs/2310.08853
  • Pdf link: https://arxiv.org/pdf/2310.08853
  • Abstract Our perception of cold enriches our understanding of the world and allows us to interact with it. Therefore, the presentation of cold sensations will be beneficial in improving the sense of immersion and presence in virtual reality and the metaverse. This study proposed a novel method for spatially continuous cold sensation presentation based on low-temperature airflows. We defined the shortest distance between two airflows perceived as different cold stimuli as a local cold stimulus group discrimination threshold (LCSGDT). By setting the distance between airflows within the LCSGDT, spatially continuous cold sensations can be achieved with an optimal number of cold airflows. We hypothesized that the LCSGDTs are related to the heat-transfer capability of airflows and developed a model to relate them. We investigated the LCSGDTs at a flow rate of 25 L/min and presentation distances ranging from 10 to 50 mm. The results showed that under these conditions, the LCSGDTs are 131.4 $\pm$ 1.9 mm, and the heat-transfer capacity of the airflow corresponding to these LCSGDTs is an almost constant value, that is, 0.92.

Does Graph Distillation See Like Vision Dataset Counterpart?

  • Authors: Beining Yang, Kai Wang, Qingyun Sun, Cheng Ji, Xingcheng Fu, Hao Tang, Yang You, Jianxin Li
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2310.09192
  • Pdf link: https://arxiv.org/pdf/2310.09192
  • Abstract Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have attracted increasing concerns. Existing graph condensation methods primarily focus on optimizing the feature matrices of condensed graphs while overlooking the impact of the structure information from the original graphs. To investigate the impact of the structure information, we conduct analysis from the spectral domain and empirically identify substantial Laplacian Energy Distribution (LED) shifts in previous works. Such shifts lead to poor performance in cross-architecture generalization and specific tasks, including anomaly detection and link prediction. In this paper, we propose a novel Structure-broadcasting Graph Dataset Distillation (SGDD) scheme for broadcasting the original structure information to the generation of the synthetic one, which explicitly prevents overlooking the original structure information. Theoretically, the synthetic graphs by SGDD are expected to have smaller LED shifts than previous works, leading to superior performance in both cross-architecture settings and specific tasks. We validate the proposed SGDD across 9 datasets and achieve state-of-the-art results on all of them: for example, on the YelpChi dataset, our approach maintains 98.6% test accuracy of training on the original graph dataset with 1,000 times saving on the scale of the graph. Moreover, we empirically observe 17.6% to 31.4% reductions in LED shift across the 9 datasets. Extensive experiments and analysis verify the effectiveness and necessity of the proposed designs. The code is available in the GitHub repository: https://github.com/RingBDStack/SGDD.

Keyword: optimization

Defect Analysis of 3D Printed Cylinder Object Using Transfer Learning Approaches

  • Authors: Md Manjurul Ahsan, Shivakumar Raman, Zahed Siddique
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08645
  • Pdf link: https://arxiv.org/pdf/2310.08645
  • Abstract Additive manufacturing (AM) is gaining attention across various industries like healthcare, aerospace, and automotive. However, identifying defects early in the AM process can reduce production costs and improve productivity - a key challenge. This study explored the effectiveness of machine learning (ML) approaches, specifically transfer learning (TL) models, for defect detection in 3D-printed cylinders. Images of cylinders were analyzed using models including VGG16, VGG19, ResNet50, ResNet101, InceptionResNetV2, and MobileNetV2. Performance was compared across two datasets using accuracy, precision, recall, and F1-score metrics. In the first study, VGG16, InceptionResNetV2, and MobileNetV2 achieved perfect scores. In contrast, ResNet50 had the lowest performance, with an average F1-score of 0.32. Similarly, in the second study, MobileNetV2 correctly classified all instances, while ResNet50 struggled with more false positives and fewer true positives, resulting in an F1-score of 0.75. Overall, the findings suggest certain TL models like MobileNetV2 can deliver high accuracy for AM defect classification, although performance varies across algorithms. The results provide insights into model optimization and integration needs for reliable automated defect analysis during 3D printing. By identifying the top-performing TL techniques, this study aims to enhance AM product quality through robust image-based monitoring and inspection.

SplitBeam: Effective and Efficient Beamforming in Wi-Fi Networks Through Split Computing

  • Authors: Niloofar Bahadori, Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia
  • Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG); Signal Processing (eess.SP)
  • Arxiv link: https://arxiv.org/abs/2310.08656
  • Pdf link: https://arxiv.org/pdf/2310.08656
  • Abstract Modern IEEE 802.11 (Wi-Fi) networks extensively rely on multiple-input multiple-output (MIMO) to significantly improve throughput. To correctly beamform MIMO transmissions, the access point needs to frequently acquire a beamforming matrix (BM) from each connected station. However, the size of the matrix grows with the number of antennas and subcarriers, resulting in an increasing amount of airtime overhead and computational load at the station. Conventional approaches come with either excessive computational load or loss of beamforming precision. For this reason, we propose SplitBeam, a new framework where we train a split deep neural network (DNN) to directly output the BM given the channel state information (CSI) matrix as input. We formulate and solve a bottleneck optimization problem (BOP) to keep computation, airtime overhead, and bit error rate (BER) below application requirements. We perform extensive experimental CSI collection with off-the-shelf Wi-Fi devices in two distinct environments and compare the performance of SplitBeam with the standard IEEE 802.11 algorithm for BM feedback and the state-of-the-art DNN-based approach LB-SciFi. Our experimental results show that SplitBeam reduces the beamforming feedback size and computational complexity by respectively up to 81% and 84% while maintaining BER within about 10^-3 of existing approaches. We also implement the SplitBeam DNNs on FPGA hardware to estimate the end-to-end BM reporting delay, and show that the latter is less than 10 milliseconds in the most complex scenario, which is the target channel sounding frequency in realistic multi-user MIMO scenarios.

Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach

  • Authors: Heasung Kim, Sravan Ankireddy
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)
  • Arxiv link: https://arxiv.org/abs/2310.08660
  • Pdf link: https://arxiv.org/pdf/2310.08660
  • Abstract In this project, we consider the problem of network parameter optimization for rate maximization. We frame this as a joint optimization problem of power control, beam forming, and interference cancellation. We consider the setting where multiple Base Stations (BSs) are communicating with multiple user equipments (UEs). Because of the exponential computational complexity of brute force search, we instead solve this non-convex optimization problem using deep reinforcement learning (RL) techniques. The modern communication systems are notorious for their difficulty in exactly modeling their behaviour. This limits us in using RL based algorithms as interaction with the environment is needed for the agent to explore and learn efficiently. Further, it is ill-advised to deploy the algorithm in the real world for exploration and learning because of the high cost of failure. In contrast to the previous RL-based solutions proposed, such as deep-Q network (DQN) based control, we propose taking an offline model based approach. We specifically consider discrete batch constrained deep Q-learning (BCQ) and show that performance similar to DQN can be achieved with only a fraction of the data and without the need for exploration. This results in maximizing sample efficiency and minimizing risk in the deployment of a new algorithm to commercial networks. We provide the entire resource of the project, including code and data, at the following link: https://github.com/Heasung-Kim/safe-rl-deployment-for-5g.

SSG2: A new modelling paradigm for semantic segmentation

  • Authors: Foivos I. Diakogiannis, Suzanne Furby, Peter Caccetta, Xiaoliang Wu, Rodrigo Ibata, Ondrej Hlinka, John Taylor
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08671
  • Pdf link: https://arxiv.org/pdf/2310.08671
  • Abstract State-of-the-art models in semantic segmentation primarily operate on single, static images, generating corresponding segmentation masks. This one-shot approach leaves little room for error correction, as the models lack the capability to integrate multiple observations for enhanced accuracy. Inspired by work on semantic change detection, we address this limitation by introducing a methodology that leverages a sequence of observables generated for each static input image. By adding this "temporal" dimension, we exploit strong signal correlations between successive observations in the sequence to reduce error rates. Our framework, dubbed SSG2 (Semantic Segmentation Generation 2), employs a dual-encoder, single-decoder base network augmented with a sequence model. The base model learns to predict the set intersection, union, and difference of labels from dual-input images. Given a fixed target input image and a set of support images, the sequence model builds the predicted mask of the target by synthesizing the partial views from each sequence step and filtering out noise. We evaluate SSG2 across three diverse datasets: UrbanMonitor, featuring orthoimage tiles from Darwin, Australia with five spectral bands and 0.2m spatial resolution; ISPRS Potsdam, which includes true orthophoto images with multiple spectral bands and a 5cm ground sampling distance; and ISIC2018, a medical dataset focused on skin lesion segmentation, particularly melanoma. The SSG2 model demonstrates rapid convergence within the first few tens of epochs and significantly outperforms UNet-like baseline models with the same number of gradient updates. However, the addition of the temporal dimension results in an increased memory footprint. While this could be a limitation, it is offset by the advent of higher-memory GPUs and coding optimizations.

An Efficient Resilient MPC Scheme via Constraint Tightening against Cyberattacks: Application to Vehicle Cruise Control

  • Authors: Milad Farsi, Shuhao Bian, Nasser L. Azad, Xiaobing Shi, Andrew Walenstein
  • Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
  • Arxiv link: https://arxiv.org/abs/2310.08680
  • Pdf link: https://arxiv.org/pdf/2310.08680
  • Abstract We propose a novel framework for designing a resilient Model Predictive Control (MPC) targeting uncertain linear systems under cyber attack. Assuming a periodic attack scenario, we model the system under Denial of Service (DoS) attack, also with measurement noise, as an uncertain linear system with parametric and additive uncertainty. To detect anomalies, we employ a Kalman filter-based approach. Then, through our observations of the intensity of the launched attack, we determine a range of possible values for the system matrices, as well as establish bounds of the additive uncertainty for the equivalent uncertain system. Leveraging a recent constraint tightening robust MPC method, we present an optimization-based resilient algorithm. Accordingly, we compute the uncertainty bounds and corresponding constraints offline for various attack magnitudes. Then, this data can be used efficiently in the MPC computations online. We demonstrate the effectiveness of the developed framework on the Adaptive Cruise Control (ACC) problem.

Kernel-Elastic Autoencoder for Molecular Design

  • Authors: Haote Li, Yu Shee, Brandon Allen, Federica Maschietto, Victor Batista
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08685
  • Pdf link: https://arxiv.org/pdf/2310.08685
  • Abstract We introduce the Kernel-Elastic Autoencoder (KAE), a self-supervised generative model based on the transformer architecture with enhanced performance for molecular design. KAE is formulated based on two novel loss functions: modified maximum mean discrepancy and weighted reconstruction. KAE addresses the long-standing challenge of achieving valid generation and accurate reconstruction at the same time. KAE achieves remarkable diversity in molecule generation while maintaining near-perfect reconstructions on the independent testing dataset, surpassing previous molecule-generating models. KAE enables conditional generation and allows for decoding based on beam search resulting in state-of-the-art performance in constrained optimizations. Furthermore, KAE can generate molecules conditional to favorable binding affinities in docking applications as confirmed by AutoDock Vina and Glide scores, outperforming all existing candidates from the training dataset. Beyond molecular design, we anticipate KAE could be applied to solve problems by generation in a wide range of applications.

Provably Robust Cost-Sensitive Learning via Randomized Smoothing

  • Authors: Yuan Xin, Michael Backes, Xiao Zhang
  • Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/2310.08732
  • Pdf link: https://arxiv.org/pdf/2310.08732
  • Abstract We focus on learning adversarially robust classifiers under a cost-sensitive scenario, where the potential harm of different classwise adversarial transformations is encoded in a binary cost matrix. Existing methods are either empirical that cannot certify robustness or suffer from inherent scalability issues. In this work, we study whether randomized smoothing, a more scalable robustness certification framework, can be leveraged to certify cost-sensitive robustness. Built upon a notion of cost-sensitive certified radius, we show how to adapt the standard randomized smoothing certification pipeline to produce tight robustness guarantees for any cost matrix. In addition, with fine-grained certified radius optimization schemes specifically designed for different data subgroups, we propose an algorithm to train smoothed classifiers that are optimized for cost-sensitive robustness. Extensive experiments on image benchmarks and a real-world medical dataset demonstrate the superiority of our method in achieving significantly improved performance of certified cost-sensitive robustness while having a negligible impact on overall accuracy.
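
As background for the abstract above, here is a minimal sketch of the standard (not cost-sensitive) certified radius used in randomized smoothing, R = (σ/2)(Φ⁻¹(p_A) − Φ⁻¹(p_B)). The function name `certified_radius` and the example probabilities are illustrative assumptions. The cost-sensitive radius proposed in the paper is not reproduced here.

```python
from statistics import NormalDist


def certified_radius(p_a_lower: float, p_b_upper: float, sigma: float) -> float:
    """Standard L2 certified radius of a Gaussian-smoothed classifier.

    p_a_lower: lower bound on the probability of the top class under noise.
    p_b_upper: upper bound on the probability of the runner-up class.
    sigma:     standard deviation of the isotropic Gaussian noise.
    """
    if not (0.0 < p_b_upper <= p_a_lower < 1.0):
        raise ValueError("need 0 < p_b_upper <= p_a_lower < 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    inv_cdf = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return 0.5 * sigma * (inv_cdf(p_a_lower) - inv_cdf(p_b_upper))


# Hypothetical bounds, for illustration only.
radius = certified_radius(p_a_lower=0.9, p_b_upper=0.1, sigma=0.5)
print(f"certified L2 radius: {radius:.4f}")
```

A larger gap between the two class probabilities, or a larger noise level, gives a larger radius. In practice the probability bounds would come from Monte Carlo sampling with a confidence interval.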

Evolutionary Dynamic Optimization and Machine Learning

  • Authors: Abdennour Boulesnane
  • Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08748
  • Pdf link: https://arxiv.org/pdf/2310.08748
  • Abstract Evolutionary Computation (EC) has emerged as a powerful field of Artificial Intelligence, inspired by nature's mechanisms of gradual development. However, EC approaches often face challenges such as stagnation, diversity loss, computational complexity, population initialization, and premature convergence. To overcome these limitations, researchers have integrated learning algorithms with evolutionary techniques. This integration harnesses the valuable data generated by EC algorithms during iterative searches, providing insights into the search space and population dynamics. Similarly, the relationship between evolutionary algorithms and Machine Learning (ML) is reciprocal, as EC methods offer exceptional opportunities for optimizing complex ML tasks characterized by noisy, inaccurate, and dynamic objective functions. These hybrid techniques, known as Evolutionary Machine Learning (EML), have been applied at various stages of the ML process. EC techniques play a vital role in tasks such as data balancing, feature selection, and model training optimization. Moreover, ML tasks often require dynamic optimization, for which Evolutionary Dynamic Optimization (EDO) is valuable. This paper presents the first comprehensive exploration of reciprocal integration between EDO and ML. The study aims to stimulate interest in the evolutionary learning community and inspire innovative contributions in this domain.

Constrained Bayesian Optimization with Adaptive Active Learning of Unknown Constraints

  • Authors: Fengxue Zhang, Zejie Zhu, Yuxin Chen
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2310.08751
  • Pdf link: https://arxiv.org/pdf/2310.08751
  • Abstract Optimizing objectives under constraints, where both the objectives and constraints are black box functions, is a common scenario in real-world applications such as scientific experimental design, design of medical therapies, and industrial process optimization. One popular approach to handling these complex scenarios is Bayesian Optimization (BO). In terms of theoretical behavior, BO is relatively well understood in the unconstrained setting, where its principles have been well explored and validated. However, when it comes to constrained Bayesian optimization (CBO), the existing framework often relies on heuristics or approximations without the same level of theoretical guarantees. In this paper, we delve into the theoretical and practical aspects of constrained Bayesian optimization, where the objective and constraints can be independently evaluated and are subject to noise. By recognizing that both the objective and constraints can help identify high-confidence regions of interest (ROI), we propose an efficient CBO framework that intersects the ROIs identified from each aspect to determine the general ROI. The ROI, coupled with a novel acquisition function that adaptively balances the optimization of the objective and the identification of feasible regions, enables us to derive rigorous theoretical justifications for its performance. We showcase the efficiency and robustness of our proposed CBO framework through empirical evidence and discuss the fundamental challenge of deriving practical regret bounds for CBO algorithms.

Implicit Shape and Appearance Priors for Few-Shot Full Head Reconstruction

  • Authors: Pol Caselles, Eduard Ramon, Jaime Garcia, Gil Triginer, Francesc Moreno-Noguer
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08784
  • Pdf link: https://arxiv.org/pdf/2310.08784
  • Abstract Recent advancements in learning techniques that employ coordinate-based neural representations have yielded remarkable results in multi-view 3D reconstruction tasks. However, these approaches often require a substantial number of input views (typically several tens) and computationally intensive optimization procedures to achieve their effectiveness. In this paper, we address these limitations specifically for the problem of few-shot full 3D head reconstruction. We accomplish this by incorporating a probabilistic shape and appearance prior into coordinate-based representations, enabling faster convergence and improved generalization when working with only a few input images (even as low as a single image). During testing, we leverage this prior to guide the fitting process of a signed distance function using a differentiable renderer. By incorporating the statistical prior alongside parallelizable ray tracing and dynamic caching strategies, we achieve an efficient and accurate approach to few-shot full 3D head reconstruction. Moreover, we extend the H3DS dataset, which now comprises 60 high-resolution 3D full head scans and their corresponding posed images and masks, which we use for evaluation purposes. By leveraging this dataset, we demonstrate the remarkable capabilities of our approach in achieving state-of-the-art results in geometry reconstruction while being an order of magnitude faster than previous approaches.

DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing

  • Authors: Yueming Lyu, Kang Zhao, Bo Peng, Yue Jiang, Yingya Zhang, Jing Dong
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2310.08785
  • Pdf link: https://arxiv.org/pdf/2310.08785
  • Abstract Text-guided image editing faces significant challenges to training and inference flexibility. Much literature collects large amounts of annotated image-text pairs to train text-conditioned generative models from scratch, which is expensive and not efficient. After that, some approaches that leverage pre-trained vision-language models are put forward to avoid data collection, but they are also limited by either per text-prompt optimization or inference-time hyper-parameters tuning. To address these issues, we investigate and identify a specific space, referred to as CLIP DeltaSpace, where the CLIP visual feature difference of two images is semantically aligned with the CLIP textual feature difference of their corresponding text descriptions. Based on DeltaSpace, we propose a novel framework called DeltaEdit, which maps the CLIP visual feature differences to the latent space directions of a generative model during the training phase, and predicts the latent space directions from the CLIP textual feature differences during the inference phase. And this design endows DeltaEdit with two advantages: (1) text-free training; (2) generalization to various text prompts for zero-shot inference. Extensive experiments validate the effectiveness and versatility of DeltaEdit with different generative models, including both the GAN model and the diffusion model, in achieving flexible text-guided image editing. Code is available at https://github.com/Yueming6568/DeltaEdit.

Incentive Mechanism Design for Distributed Ensemble Learning

  • Authors: Chao Huang, Pengchao Han, Jianwei Huang
  • Subjects: Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08792
  • Pdf link: https://arxiv.org/pdf/2310.08792
  • Abstract Distributed ensemble learning (DEL) involves training multiple models at distributed learners, and then combining their predictions to improve performance. Existing related studies focus on DEL algorithm design and optimization but ignore the important issue of incentives, without which self-interested learners may be unwilling to participate in DEL. We aim to fill this gap by presenting a first study on the incentive mechanism design for DEL. Our proposed mechanism specifies both the amount of training data and reward for learners with heterogeneous computation and communication costs. One design challenge is to have an accurate understanding regarding how learners' diversity (in terms of training data) affects the ensemble accuracy. To this end, we decompose the ensemble accuracy into a diversity-precision tradeoff to guide the mechanism design. Another challenge is that the mechanism design involves solving a mixed-integer program with a large search space. To this end, we propose an alternating algorithm that iteratively updates each learner's training data size and reward. We prove that under mild conditions, the algorithm converges. Numerical results using MNIST dataset show an interesting result: our proposed mechanism may prefer a lower level of learner diversity to achieve a higher ensemble accuracy.

Mitigating Bias for Question Answering Models by Tracking Bias Influence

  • Authors: Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng
  • Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08795
  • Pdf link: https://arxiv.org/pdf/2310.08795
  • Abstract Models of various NLP tasks have been shown to exhibit stereotypes, and the bias in the question answering (QA) models is especially harmful as the output answers might be directly consumed by the end users. There have been datasets to evaluate bias in QA models, while bias mitigation technique for the QA models is still under-explored. In this work, we propose BMBI, an approach to mitigate the bias of multiple-choice QA models. Based on the intuition that a model would lean to be more biased if it learns from a biased example, we measure the bias level of a query instance by observing its influence on another instance. If the influenced instance is more biased, we derive that the query instance is biased. We then use the bias level detected as an optimization objective to form a multi-task learning setting in addition to the original QA task. We further introduce a new bias evaluation metric to quantify bias in a comprehensive and sensitive way. We show that our method could be applied to multiple QA formulations across multiple bias categories. It can significantly reduce the bias level in all 9 bias categories in the BBQ dataset while maintaining comparable QA accuracy.

A High-throughput and Secure Coded Blockchain for IoT

  • Authors: Amirhossein Taherpour, Xiaodong Wang
  • Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
  • Arxiv link: https://arxiv.org/abs/2310.08822
  • Pdf link: https://arxiv.org/pdf/2310.08822
  • Abstract We propose a new coded blockchain scheme suitable for the Internet-of-Things (IoT) network. In contrast to existing works for coded blockchains, especially blockchain-of-things, the proposed scheme is more realistic, practical, and secure while achieving high throughput. This is accomplished by: 1) modeling the variety of transactions using a reward model, based on which an optimization problem is solved to select transactions that are more accessible and cheaper computational-wise to be processed together; 2) a transaction-based and lightweight consensus algorithm that emphasizes on using the minimum possible number of miners for processing the transactions; and 3) employing the raptor codes with linear-time encoding and decoding which results in requiring lower storage to maintain the blockchain and having a higher throughput. We provide detailed analysis and simulation results on the proposed scheme and compare it with the state-of-the-art coded IoT blockchain schemes including Polyshard and LCB, to show the advantages of our proposed scheme in terms of security, storage, decentralization, and throughput.

Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation

  • Authors: Yilin Lyu, Liyuan Wang, Xingxing Zhang, Zicheng Sun, Hang Su, Jun Zhu, Liping Jing
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08855
  • Pdf link: https://arxiv.org/pdf/2310.08855
  • Abstract Continual learning entails learning a sequence of tasks and balancing their knowledge appropriately. With limited access to old training samples, much of the current work in deep neural networks has focused on overcoming catastrophic forgetting of old tasks in gradient-based optimization. However, the normalization layers provide an exception, as they are updated interdependently by the gradient and statistics of currently observed training samples, which require specialized strategies to mitigate recency bias. In this work, we focus on the most popular Batch Normalization (BN) and provide an in-depth theoretical analysis of its sub-optimality in continual learning. Our analysis demonstrates the dilemma between balance and adaptation of BN statistics for incremental tasks, which potentially affects training stability and generalization. Targeting these particular challenges, we propose Adaptive Balance of BN (AdaB$^2$N), which appropriately incorporates a Bayesian-based strategy to adapt task-wise contributions and a modified momentum to balance BN statistics, corresponding to the training and testing stages. By implementing BN in a continual learning fashion, our approach achieves significant performance gains across a wide range of benchmarks, particularly for the challenging yet realistic online scenarios (e.g., up to 7.68%, 6.86% and 4.26% on Split CIFAR-10, Split CIFAR-100 and Split Mini-ImageNet, respectively). Our code is available at https://github.com/lvyilin/AdaB2N.

High-efficiency and positivity-preserving stabilized SAV methods for gradient flows

  • Authors: Zhengguang Liu, Yanrong Zhang, Xiaoli Li
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/2310.08893
  • Pdf link: https://arxiv.org/pdf/2310.08893
  • Abstract The scalar auxiliary variable (SAV)-type methods are very popular techniques for solving various nonlinear dissipative systems. Compared to the semi-implicit method, the baseline SAV method can keep a modified energy dissipation law but doubles the computational cost. The general SAV approach does not add additional computation but needs to solve a semi-implicit solution in advance, which may potentially compromise the accuracy and stability. In this paper, we construct novel first- and second-order unconditionally energy-stable and positivity-preserving stabilized SAV (PS-SAV) schemes for $L^2$ and $H^{-1}$ gradient flows. The constructed schemes can reduce the computational cost of the baseline SAV method by nearly half while preserving its accuracy and stability. Meanwhile, the introduced auxiliary variable is always positive while the baseline SAV cannot guarantee this positivity-preserving property. Unconditional energy dissipation laws are derived for the proposed numerical schemes. We also establish a rigorous error analysis of the first-order scheme for the Allen-Cahn type equation in the $l^{\infty}(0,T; H^1(\Omega))$ norm. In addition, we propose an energy optimization technique to optimize the modified energy close to the original energy. Several interesting numerical examples are presented to demonstrate the accuracy and effectiveness of the proposed methods.

Migrant Resettlement by Evolutionary Multi-objective Optimization

  • Authors: Dan-Xuan Liu, Yu-Ran Gu, Chao Qian, Xin Mu, Ke Tang
  • Subjects: Neural and Evolutionary Computing (cs.NE)
  • Arxiv link: https://arxiv.org/abs/2310.08896
  • Pdf link: https://arxiv.org/pdf/2310.08896
  • Abstract Migration has been a universal phenomenon, which brings opportunities as well as challenges for global development. As the number of migrants (e.g., refugees) increases rapidly in recent years, a key challenge faced by each country is the problem of migrant resettlement. This problem has attracted scientific research attention, from the perspective of maximizing the employment rate. Previous works mainly formulated migrant resettlement as an approximately submodular optimization problem subject to multiple matroid constraints and employed the greedy algorithm, whose performance, however, may be limited due to its greedy nature. In this paper, we propose a new framework MR-EMO based on Evolutionary Multi-objective Optimization, which reformulates Migrant Resettlement as a bi-objective optimization problem that maximizes the expected number of employed migrants and minimizes the number of dispatched migrants simultaneously, and employs a Multi-Objective Evolutionary Algorithm (MOEA) to solve the bi-objective problem. We implement MR-EMO using three MOEAs, the popular NSGA-II, MOEA/D as well as the theoretically grounded GSEMO. To further improve the performance of MR-EMO, we propose a specific MOEA, called GSEMO-SR, using matrix-swap mutation and repair mechanism, which has a better ability to search for feasible solutions. We prove that MR-EMO using either GSEMO or GSEMO-SR can achieve better theoretical guarantees than the previous greedy algorithm. Experimental results under the interview and coordination migration models clearly show the superiority of MR-EMO (with either NSGA-II, MOEA/D, GSEMO or GSEMO-SR) over previous algorithms, and that using GSEMO-SR leads to the best performance of MR-EMO.

Scalarization for Multi-Task and Multi-Domain Learning at Scale

  • Authors: Amelie Royer, Tijmen Blankevoort, Babak Ehteshami Bejnordi
  • Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08910
  • Pdf link: https://arxiv.org/pdf/2310.08910
  • Abstract Training a single model on multiple input domains and/or output tasks allows for compressing information from multiple sources into a unified backbone and hence improves model efficiency. It also enables potential positive knowledge transfer across tasks/domains, leading to improved accuracy and data-efficient training. However, optimizing such networks is a challenge, in particular due to discrepancies between the different tasks or domains. Despite several hypotheses and solutions proposed over the years, recent work has shown that uniform scalarization training, i.e., simply minimizing the average of the task losses, yields on-par performance with more costly SotA optimization methods. This raises the issue of how well we understand the training dynamics of multi-task and multi-domain networks. In this work, we first devise a large-scale unified analysis of multi-domain and multi-task learning to better understand the dynamics of scalarization across varied task/domain combinations and model sizes. Following these insights, we then propose to leverage population-based training to efficiently search for the optimal scalarization weights when dealing with a large number of tasks or domains.
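
As a rough illustration of the scalarization idea named in this abstract (not the authors' code), the sketch below combines per-task losses into one scalar. With no weights it falls back to the uniform average. The function name `scalarize` and its signature are hypothetical.

```python
from typing import Optional, Sequence


def scalarize(task_losses: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Combine per-task losses into a single scalar objective.

    Hypothetical helper: with weights=None this is uniform scalarization,
    i.e. the plain average of the task losses.
    """
    n = len(task_losses)
    if n == 0:
        raise ValueError("need at least one task loss")
    if weights is None:
        weights = [1.0 / n] * n  # uniform scalarization
    if len(weights) != n:
        raise ValueError("one weight per task is required")
    return sum(w * loss for w, loss in zip(weights, task_losses))
```

A search over scalarization weights, as the abstract proposes, would call such a function with different `weights` vectors; how those vectors are chosen is outside this sketch.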

Attacking The Assortativity Coefficient Under A Rewiring Strategy

  • Authors: Shuo Zou, Bo Zhou, Qi Xuan
  • Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)
  • Arxiv link: https://arxiv.org/abs/2310.08924
  • Pdf link: https://arxiv.org/pdf/2310.08924
  • Abstract Degree correlation is an important characteristic of networks, which is usually quantified by the assortativity coefficient. However, concerns arise about changing the assortativity coefficient of a network when networks suffer from adversarial attacks. In this paper, we analyze the factors that affect the assortativity coefficient and study the optimization problem of maximizing or minimizing the assortativity coefficient ($r$) in rewired networks with $k$ pairs of edges. We propose a greedy algorithm and formulate the optimization problem using integer programming to obtain the optimal solution for this problem. Through experiments, we demonstrate the reasonableness and effectiveness of our proposed algorithm. For example, rewiring 10% of the edges in the ER network improved the assortativity coefficient by 60%.
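
For readers unfamiliar with the quantity being attacked, here is a minimal sketch of the degree assortativity coefficient for an undirected graph, computed as the Pearson correlation between the degrees at the two ends of each edge. The `rewire` helper assumes a degree-preserving endpoint swap; the abstract does not spell out the exact rewiring rule, so that part is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def assortativity(edges):
    """Pearson correlation of endpoint degrees over an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean and variance
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined when all endpoint degrees are equal")
    return cov / var


def rewire(edges, i, j):
    """Assumed rewiring: (a, b), (c, d) -> (a, d), (c, b).

    Node degrees are unchanged. This sketch does not check for self-loops
    or duplicate edges.
    """
    (a, b), (c, d) = edges[i], edges[j]
    new_edges = list(edges)
    new_edges[i] = (a, d)
    new_edges[j] = (c, b)
    return new_edges
```

A greedy attack in this spirit would repeatedly pick the swap that changes `assortativity` the most, up to $k$ swaps.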

Data-driven aerodynamic shape design with distributionally robust optimization approaches

  • Authors: Long Chen, Jan Rottmayer, Lisa Kusch, Nicolas R. Gauger, Yinyu Ye
  • Subjects: Computational Engineering, Finance, and Science (cs.CE); Optimization and Control (math.OC)
  • Arxiv link: https://arxiv.org/abs/2310.08931
  • Pdf link: https://arxiv.org/pdf/2310.08931
  • Abstract We formulate and solve data-driven aerodynamic shape design problems with distributionally robust optimization (DRO) approaches. Building on the findings of Gotoh et al. (2018), we study the connections between a class of DRO and the Taguchi method in the context of robust design optimization. Our preliminary computational experiments on aerodynamic shape optimization in transonic turbulent flow show promising design results.

Online Adaptive Disparity Estimation for Dynamic Scenes in Structured Light Systems

  • Authors: Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08934
  • Pdf link: https://arxiv.org/pdf/2310.08934
  • Abstract In recent years, deep neural networks have shown remarkable progress in dense disparity estimation from dynamic scenes in monocular structured light systems. However, their performance significantly drops when applied in unseen environments. To address this issue, self-supervised online adaptation has been proposed as a solution to bridge this performance gap. Unlike traditional fine-tuning processes, online adaptation performs test-time optimization to adapt networks to new domains. Therefore, achieving fast convergence during the adaptation process is critical for attaining satisfactory accuracy. In this paper, we propose an unsupervised loss function based on long sequential inputs. It ensures better gradient directions and faster convergence. Our loss function is designed using a multi-frame pattern flow, which comprises a set of sparse trajectories of the projected pattern along the sequence. We estimate the sparse pseudo ground truth with a confidence mask using a filter-based method, which guides the online adaptation process. Our proposed framework significantly improves the online adaptation speed and achieves superior performance on unseen data.

UniParser: Multi-Human Parsing with Unified Correlation Representation Learning

  • Authors: Jiaming Chu, Lei Jin, Junliang Xing, Jian Zhao
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08984
  • Pdf link: https://arxiv.org/pdf/2310.08984
  • Abstract Multi-human parsing is an image segmentation task necessitating both instance-level and fine-grained category-level information. However, prior research has typically processed these two types of information through separate branches and distinct output formats, leading to inefficient and redundant frameworks. This paper introduces UniParser, which integrates instance-level and category-level representations in three key aspects: 1) we propose a unified correlation representation learning approach, allowing our network to learn instance and category features within the cosine space; 2) we unify the form of outputs of each module as pixel-level segmentation results while supervising instance and category features using a homogeneous label accompanied by an auxiliary loss; and 3) we design a joint optimization procedure to fuse instance and category representations. By virtue of unifying instance-level and category-level output, UniParser circumvents manually designed post-processing techniques and surpasses state-of-the-art methods, achieving 49.3% AP on MHPv2.0 and 60.4% AP on CIHP. We will release our source code, pretrained models, and online demos to facilitate future studies.

μ-DDRL: A QoS-Aware Distributed Deep Reinforcement Learning Technique for Service Offloading in Fog computing Environments

  • Authors: Mohammad Goudarzi, Maria A. Rodriguez, Majid Sarvi, Rajkumar Buyya
  • Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
  • Arxiv link: https://arxiv.org/abs/2310.09003
  • Pdf link: https://arxiv.org/pdf/2310.09003
  • Abstract Fog and Edge computing extend cloud services to the proximity of end users, allowing many Internet of Things (IoT) use cases, particularly latency-critical applications. Smart devices, such as traffic and surveillance cameras, often do not have sufficient resources to process computation-intensive and latency-critical services. Hence, the constituent parts of services can be offloaded to nearby Edge/Fog resources for processing and storage. However, making offloading decisions for complex services in highly stochastic and dynamic environments is an important, yet difficult task. Recently, Deep Reinforcement Learning (DRL) has been used in many complex service offloading problems; however, existing techniques are most suitable for centralized environments, and their convergence to the best-suitable solutions is slow. In addition, constituent parts of services often have predefined data dependencies and quality of service constraints, which further intensify the complexity of service offloading. To solve these issues, we propose a distributed DRL technique following the actor-critic architecture based on Asynchronous Proximal Policy Optimization (APPO) to achieve efficient and diverse distributed experience trajectory generation. Also, we employ PPO clipping and V-trace techniques for off-policy correction for faster convergence to the most suitable service offloading solutions. The results obtained demonstrate that our technique converges quickly, offers high scalability and adaptability, and outperforms its counterparts by improving the execution time of heterogeneous services.
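
The PPO clipping mentioned in this abstract can be sketched for a single sample as below. This is the standard clipped surrogate objective, not the authors' APPO implementation, and it leaves out V-trace entirely. The function name and the default `eps=0.2` are illustrative assumptions.

```python
def ppo_clipped_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Clipped surrogate objective for one sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s). Clipping it to [1 - eps, 1 + eps]
    and taking the minimum removes the incentive for overly large policy updates.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)
```

With a positive advantage the objective stops growing once the ratio exceeds `1 + eps`. With a negative advantage it stops improving once the ratio drops below `1 - eps`.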

SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network

  • Authors: Lei Yao, Yong Zhang, Zilong Yan, Jialu Tian
  • Subjects: Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2310.09049
  • Pdf link: https://arxiv.org/pdf/2310.09049
  • Abstract In the rapid development of artificial intelligence, solving complex AI tasks is a crucial technology in intelligent mobile networks. Despite the good performance of specialized AI models in intelligent mobile networks, they are unable to handle complicated AI tasks. To address this challenge, we propose Systematic Artificial Intelligence (SAI), which is a framework designed to solve AI tasks by leveraging Large Language Models (LLMs) and JSON-format intent-based input to connect a self-designed model library and database. Specifically, we first design a multi-input component, which simultaneously integrates LLMs and JSON-format intent-based inputs to fulfill the diverse intent requirements of different users. In addition, we introduce a model library module based on model cards, which are used to match modules pairwise for model composition. Model cards contain the corresponding model's name and the required performance metrics. Then, on receiving user network requirements, we execute each subtask for multiple selected model combinations and provide output based on the execution results and LLM feedback. By leveraging the language capabilities of LLMs and the abundant AI models in the model library, SAI can complete numerous complex AI tasks in the communication network, achieving impressive results in network optimization, resource allocation, and other challenging tasks.

Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model

  • Authors: Qichen Ye, Junling Liu, Dading Chong, Peilin Zhou, Yining Hua, Andrew Liu
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link: https://arxiv.org/abs/2310.09089
  • Pdf link: https://arxiv.org/pdf/2310.09089
  • Abstract Integrating large language models (LLMs) into healthcare presents potential but faces challenges. Directly pre-training LLMs for domains like medicine is resource-heavy and sometimes unfeasible. Sole reliance on Supervised Fine-tuning (SFT) can result in overconfident predictions and may not tap into domain specific insights. Addressing these challenges, we present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), SFT, and Direct Preference Optimization (DPO). A notable contribution of our study is the introduction of a 3Gb Chinese Medicine (ChiMed) dataset, encompassing medical question answering, plain texts, knowledge graphs, and dialogues, segmented into three training stages. The medical LLM trained with our pipeline, Qilin-Med, exhibits significant performance boosts. In the CPT and SFT phases, it achieves 38.4% and 40.0% accuracy on the CMExam, surpassing Baichuan-7B's 33.5%. In the DPO phase, on the Huatuo-26M test set, it scores 16.66 in BLEU-1 and 27.44 in ROUGE1, outperforming the SFT's 12.69 and 24.21. This highlights the strength of our training approach in refining LLMs for medical applications.
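
To make the DPO stage in this abstract concrete, the sketch below evaluates the commonly stated DPO loss for one preference pair, given sequence log-probabilities under the trained policy and a frozen reference model. It is a generic illustration, not the Qilin-Med training code. The function name, argument names, and `beta=0.1` are assumptions.

```python
import math


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(beta * margin)).

    margin is the policy's log-ratio gain on the chosen response minus its
    gain on the rejected response, both measured against the reference model.
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z); branch for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy matches the reference on both responses the loss is log 2. It falls below that as the policy favors the chosen response more than the reference does.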

Unseen Image Synthesis with Diffusion Models

  • Authors: Ye Zhu, Yu Wu, Zhiwei Deng, Olga Russakovsky, Yan Yan
  • Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.09213
  • Pdf link: https://arxiv.org/pdf/2310.09213
  • Abstract While the current trend in the generative field is scaling up towards larger models and more training data for generalized domain representations, we go the opposite direction in this work by synthesizing unseen domain images without additional training. We do so via latent sampling and geometric optimization using pre-trained and frozen Denoising Diffusion Probabilistic Models (DDPMs) on single-domain datasets. Our key observation is that DDPMs pre-trained even just on single-domain images are already equipped with sufficient representation abilities to reconstruct arbitrary images from the inverted latent encoding following bi-directional deterministic diffusion and denoising trajectories. This motivates us to investigate the statistical and geometric behaviors of the Out-Of-Distribution (OOD) samples from unseen image domains in the latent spaces along the denoising chain. Notably, we theoretically and empirically show that the inverted OOD samples also establish Gaussians that are distinguishable from the original In-Domain (ID) samples in the intermediate latent spaces, which allows us to sample from them directly. Geometrical domain-specific and model-dependent information of the unseen subspace (e.g., sample-wise distance and angles) is used to further optimize the sampled OOD latent encodings from the estimated Gaussian prior. We conduct extensive analysis and experiments using pre-trained diffusion models (DDPM, iDDPM) on different datasets (AFHQ, CelebA-HQ, LSUN-Church, and LSUN-Bedroom), proving the effectiveness of this novel perspective to explore and re-think the diffusion models' data synthesis generalization ability.

Keyword: adam

There is no result

Keyword: gradient

Deep Reinforcement Learning for Autonomous Vehicle Intersection Navigation

  • Authors: Badr Ben Elallid, Hamza El Alaoui, Nabil Benamar
  • Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08595
  • Pdf link: https://arxiv.org/pdf/2310.08595
  • Abstract In this paper, we explore the challenges associated with navigating complex T-intersections in dense traffic scenarios for autonomous vehicles (AVs). Reinforcement learning algorithms have emerged as a promising approach to address these challenges by enabling AVs to make safe and efficient decisions in real-time. Here, we address the problem of efficiently and safely navigating T-intersections using a lower-cost, single-agent approach based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) reinforcement learning algorithm. We show that our TD3-based method, when trained and tested in the CARLA simulation platform, demonstrates stable convergence and improved safety performance in various traffic densities. Our results reveal that the proposed approach enables the AV to effectively navigate T-intersections, outperforming previous methods in terms of travel delays, collision minimization, and overall cost. This study contributes to the growing body of knowledge on reinforcement learning applications in autonomous driving and highlights the potential of single-agent, cost-effective methods for addressing more complex driving scenarios and advancing reinforcement learning algorithms in the future.

Time-vectorized numerical integration for systems of ODEs

  • Authors: Mark C. Messner, Tianchen Hu, Tianju Chen
  • Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/2310.08649
  • Pdf link: https://arxiv.org/pdf/2310.08649
  • Abstract Stiff systems of ordinary differential equations (ODEs) and sparse training data are common in scientific problems. This paper describes efficient, implicit, vectorized methods for integrating stiff systems of ordinary differential equations through time and calculating parameter gradients with the adjoint method. The main innovation is to vectorize the problem both over the number of independent time series and over a batch or "chunk" of sequential time steps, effectively vectorizing the assembly of the implicit system of ODEs. The block-bidiagonal structure of the linearized implicit system for the backward Euler method allows for further vectorization using parallel cyclic reduction (PCR). Vectorizing over both axes of the input data provides a higher bandwidth of calculations to the computing device, allowing even problems with comparatively sparse data to fully utilize modern GPUs and achieving speed-ups of greater than 100x compared to standard, sequential time integration. We demonstrate the advantages of implicit, vectorized time integration with several example problems, drawn from both analytical stiff and non-stiff ODE models as well as neural ODE models. We also describe and provide a freely available open-source implementation of the methods developed here.
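
The abstract builds on the backward Euler method. As a small, sequential illustration of why implicit steps matter for stiff problems (and not the paper's vectorized scheme), the sketch below integrates the linear test equation y' = -lam * y with both backward and forward Euler. The function names are hypothetical.

```python
def backward_euler_linear(lam: float, y0: float, h: float, steps: int) -> list:
    """Backward Euler for y' = -lam * y.

    The implicit update y_next = y + h * (-lam * y_next) has the closed form
    y_next = y / (1 + h * lam), which decays for every h > 0 when lam > 0.
    """
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] / (1.0 + h * lam))
    return ys


def forward_euler_linear(lam: float, y0: float, h: float, steps: int) -> list:
    """Forward (explicit) Euler for the same problem: y_next = y * (1 - h * lam)."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] * (1.0 - h * lam))
    return ys
```

With `lam=1000` and `h=0.01` the explicit iterates grow in magnitude by a factor of 9 per step, while the implicit iterates shrink by a factor of 11. For a general nonlinear system the implicit update requires solving a (linearized) system at each step, which is the part the paper vectorizes.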

SSG2: A new modelling paradigm for semantic segmentation

  • Authors: Foivos I. Diakogiannis, Suzanne Furby, Peter Caccetta, Xiaoliang Wu, Rodrigo Ibata, Ondrej Hlinka, John Taylor
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08671
  • Pdf link: https://arxiv.org/pdf/2310.08671
  • Abstract State-of-the-art models in semantic segmentation primarily operate on single, static images, generating corresponding segmentation masks. This one-shot approach leaves little room for error correction, as the models lack the capability to integrate multiple observations for enhanced accuracy. Inspired by work on semantic change detection, we address this limitation by introducing a methodology that leverages a sequence of observables generated for each static input image. By adding this "temporal" dimension, we exploit strong signal correlations between successive observations in the sequence to reduce error rates. Our framework, dubbed SSG2 (Semantic Segmentation Generation 2), employs a dual-encoder, single-decoder base network augmented with a sequence model. The base model learns to predict the set intersection, union, and difference of labels from dual-input images. Given a fixed target input image and a set of support images, the sequence model builds the predicted mask of the target by synthesizing the partial views from each sequence step and filtering out noise. We evaluate SSG2 across three diverse datasets: UrbanMonitor, featuring orthoimage tiles from Darwin, Australia with five spectral bands and 0.2m spatial resolution; ISPRS Potsdam, which includes true orthophoto images with multiple spectral bands and a 5cm ground sampling distance; and ISIC2018, a medical dataset focused on skin lesion segmentation, particularly melanoma. The SSG2 model demonstrates rapid convergence within the first few tens of epochs and significantly outperforms UNet-like baseline models with the same number of gradient updates. However, the addition of the temporal dimension results in an increased memory footprint. While this could be a limitation, it is offset by the advent of higher-memory GPUs and coding optimizations.

Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images

  • Authors: Zhao Ning Zou, Yuhang Zhang, Robert Wijaya
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08772
  • Pdf link: https://arxiv.org/pdf/2310.08772
  • Abstract Transformer-based object detectors (DETR) have shown strong performance across machine vision tasks, particularly in object detection. This detector is based on a self-attention mechanism along with the transformer encoder-decoder architecture to capture the global context in the image. The critical issue to be addressed is how this model architecture can handle different image nuisances, such as occlusion and adversarial perturbations. We studied this issue by measuring the performance of DETR with different experiments and benchmarking the network against convolutional neural network (CNN) based detectors like YOLO and Faster-RCNN. We found that DETR performs well when it comes to resistance to interference from information loss in occlusion images. Despite that, we found that the adversarial stickers put on the image require the network to produce a new unnecessary set of keys, queries, and values, which in most cases results in a misdirection of the network. DETR also performed worse than YOLOv5 in the image corruption benchmark. Furthermore, we found that DETR depends heavily on the main query when making a prediction, which leads to imbalanced contributions between queries since the main query receives most of the gradient flow.

Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation

  • Authors: Yilin Lyu, Liyuan Wang, Xingxing Zhang, Zicheng Sun, Hang Su, Jun Zhu, Liping Jing
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.08855
  • Pdf link: https://arxiv.org/pdf/2310.08855
  • Abstract Continual learning entails learning a sequence of tasks and balancing their knowledge appropriately. With limited access to old training samples, much of the current work in deep neural networks has focused on overcoming catastrophic forgetting of old tasks in gradient-based optimization. However, the normalization layers provide an exception, as they are updated interdependently by the gradient and statistics of currently observed training samples, which require specialized strategies to mitigate recency bias. In this work, we focus on the most popular Batch Normalization (BN) and provide an in-depth theoretical analysis of its sub-optimality in continual learning. Our analysis demonstrates the dilemma between balance and adaptation of BN statistics for incremental tasks, which potentially affects training stability and generalization. Targeting these particular challenges, we propose Adaptive Balance of BN (AdaB$^2$N), which appropriately incorporates a Bayesian-based strategy to adapt task-wise contributions and a modified momentum to balance BN statistics, corresponding to the training and testing stages. By implementing BN in a continual learning fashion, our approach achieves significant performance gains across a wide range of benchmarks, particularly for the challenging yet realistic online scenarios (e.g., up to 7.68%, 6.86% and 4.26% on Split CIFAR-10, Split CIFAR-100 and Split Mini-ImageNet, respectively). Our code is available at https://github.com/lvyilin/AdaB2N.

Re-initialization-free Level Set Method via Molecular Beam Epitaxy Equation Regularization for Image Segmentation

  • Authors: Fanghui Song, Jiebao Sun, Shengzhu Shi, Zhichang Guo, Dazhi Zhang
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08861
  • Pdf link: https://arxiv.org/pdf/2310.08861
  • Abstract The variational level set method has become a powerful tool in image segmentation due to its ability to handle complex topological changes and maintain continuity and smoothness in the process of evolution. However, its evolution process can be unstable, which results in over-flattened or over-sharpened contours and segmentation failure. To improve the accuracy and stability of evolution, we propose a high-order level set variational segmentation method integrated with molecular beam epitaxy (MBE) equation regularization. This method uses the crystal growth in the MBE process to limit the evolution of the level set function, and thus can avoid re-initialization in the evolution process and regulate the smoothness of the segmented curve. It also works for noisy images with intensity inhomogeneity, which is a challenge in image segmentation. To solve the variational model, we derive the gradient flow and design a scalar auxiliary variable (SAV) scheme coupled with the fast Fourier transform (FFT), which can significantly improve the computational efficiency compared with the traditional semi-implicit and semi-explicit schemes. Numerical experiments show that the proposed method can generate smooth segmentation curves, retain fine segmentation targets and obtain robust segmentation results for small objects. Compared to existing level set methods, this model is state-of-the-art in both accuracy and efficiency.

High-efficiency and positivity-preserving stabilized SAV methods for gradient flows

  • Authors: Zhengguang Liu, Yanrong Zhang, Xiaoli Li
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/2310.08893
  • Pdf link: https://arxiv.org/pdf/2310.08893
  • Abstract The scalar auxiliary variable (SAV)-type methods are very popular techniques for solving various nonlinear dissipative systems. Compared to the semi-implicit method, the baseline SAV method can keep a modified energy dissipation law but doubles the computational cost. The general SAV approach does not add additional computation but needs to solve a semi-implicit solution in advance, which may compromise the accuracy and stability. In this paper, we construct novel first- and second-order unconditionally energy-stable and positivity-preserving stabilized SAV (PS-SAV) schemes for $L^2$ and $H^{-1}$ gradient flows. The constructed schemes can reduce the computational cost of the baseline SAV method by nearly half while preserving its accuracy and stability. Meanwhile, the introduced auxiliary variable is always positive, while the baseline SAV cannot guarantee this positivity-preserving property. Unconditional energy dissipation laws are derived for the proposed numerical schemes. We also establish a rigorous error analysis of the first-order scheme for the Allen-Cahn type equation in the $l^{\infty}(0,T; H^1(\Omega))$ norm. In addition, we propose an energy optimization technique to optimize the modified energy close to the original energy. Several interesting numerical examples are presented to demonstrate the accuracy and effectiveness of the proposed methods.

Online Adaptive Disparity Estimation for Dynamic Scenes in Structured Light Systems

  • Authors: Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2310.08934
  • Pdf link: https://arxiv.org/pdf/2310.08934
  • Abstract In recent years, deep neural networks have shown remarkable progress in dense disparity estimation from dynamic scenes in monocular structured light systems. However, their performance significantly drops when applied in unseen environments. To address this issue, self-supervised online adaptation has been proposed as a solution to bridge this performance gap. Unlike traditional fine-tuning processes, online adaptation performs test-time optimization to adapt networks to new domains. Therefore, achieving fast convergence during the adaptation process is critical for attaining satisfactory accuracy. In this paper, we propose an unsupervised loss function based on long sequential inputs. It ensures better gradient directions and faster convergence. Our loss function is designed using a multi-frame pattern flow, which comprises a set of sparse trajectories of the projected pattern along the sequence. We estimate the sparse pseudo ground truth with a confidence mask using a filter-based method, which guides the online adaptation process. Our proposed framework significantly improves the online adaptation speed and achieves superior performance on unseen data.

Subspace Adaptation Prior for Few-Shot Learning

  • Authors: Mike Huisman, Aske Plaat, Jan N. van Rijn
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/2310.09028
  • Pdf link: https://arxiv.org/pdf/2310.09028
  • Abstract Gradient-based meta-learning techniques aim to distill useful prior knowledge from a set of training tasks such that new tasks can be learned more efficiently with gradient descent. While these methods have achieved successes in various scenarios, they commonly adapt all parameters of trainable layers when learning new tasks. This neglects potentially more efficient learning strategies for a given task distribution and may be susceptible to overfitting, especially in few-shot learning where tasks must be learned from a limited number of examples. To address these issues, we propose Subspace Adaptation Prior (SAP), a novel gradient-based meta-learning algorithm that jointly learns good initialization parameters (prior knowledge) and layer-wise parameter subspaces in the form of operation subsets that should be adaptable. In this way, SAP can learn which operation subsets to adjust with gradient descent based on the underlying task distribution, simultaneously decreasing the risk of overfitting when learning new tasks. We demonstrate that this ability is helpful as SAP yields superior or competitive performance in few-shot image classification settings (gains between 0.1% and 3.9% in accuracy). Analysis of the learned subspaces demonstrates that low-dimensional operations often yield high activation strengths, indicating that they may be important for achieving good few-shot learning performance. For reproducibility purposes, we publish all our research code publicly.

DNFS-VNE: Deep Neuro-Fuzzy System-Driven Virtual Network Embedding Algorithm

  • Authors: Ailing Xiao, Ning Chen, Sheng Wu, Shigen Shen, Weiping Ding, Peiying Zhang
  • Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
  • Arxiv link: https://arxiv.org/abs/2310.09078
  • Pdf link: https://arxiv.org/pdf/2310.09078
  • Abstract By decoupling substrate resources, network virtualization (NV) is a promising solution for meeting diverse demands and ensuring differentiated quality of service (QoS). In particular, virtual network embedding (VNE) is a critical enabling technology that enhances the flexibility and scalability of network deployment by addressing the coupling of Internet processes and services. However, in existing works, the black-box nature of deep neural networks (DNNs) limits the analysis, development, and improvement of systems. Recently, interpretable deep learning (DL), represented by deep neuro-fuzzy systems (DNFS) combined with fuzzy inference, has shown promising interpretability to further exploit the hidden value in the data. Motivated by this, we propose a DNFS-based VNE algorithm that aims to provide an interpretable NV scheme. Specifically, data-driven convolutional neural networks (CNNs) are used as fuzzy implication operators to compute the embedding probabilities of candidate substrate nodes through entailment operations. The identified fuzzy rule patterns are cached into the weights by forward computation and gradient back-propagation (BP). In addition, the fuzzy rule base is constructed based on Mamdani-type linguistic rules using linguistic labels. Finally, the effectiveness of evaluation indicators and fuzzy rules is verified by experiments.

Insuring Smiles: Predicting routine dental coverage using Spark ML

  • Authors: Aishwarya Gupta, Rahul S. Bhogale, Priyanka Thota, Prathushkumar Dathuri, Jongwook Woo
  • Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
  • Arxiv link: https://arxiv.org/abs/2310.09229
  • Pdf link: https://arxiv.org/pdf/2310.09229
  • Abstract Finding suitable health insurance coverage can be challenging for individuals and small enterprises in the USA. The Health Insurance Exchange Public Use Files (Exchange PUFs) dataset provided by CMS offers valuable information on health and dental policies [1]. In this paper, we leverage machine learning algorithms to predict if a health insurance plan covers routine dental services for adults. By analyzing plan type, region, deductibles, out-of-pocket maximums, and copayments, we employ Logistic Regression, Decision Tree, Random Forest, Gradient Boost, Factorization Model and Support Vector Machine algorithms. Our goal is to provide a clinical strategy for individuals and families to select the most suitable insurance plan based on income and expenses.

A lowest order stabilization-free mixed Virtual Element Method

  • Authors: Andrea Borio, Carlo Lovadina, Francesca Marcon, Michele Visinoni
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/2310.09260
  • Pdf link: https://arxiv.org/pdf/2310.09260
  • Abstract We initiate the design and the analysis of stabilization-free Virtual Element Methods for the Laplacian problem written in mixed form. A Virtual Element version of the lowest order Raviart-Thomas Finite Element is considered. To reduce the computational costs, a suitable projection on the gradients of harmonic polynomials is employed. A complete theoretical analysis of stability and convergence is developed in the case of quadrilateral meshes. Some numerical tests highlighting the actual behaviour of the scheme are also provided.
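
For readers unfamiliar with the term, "mixed form" in the abstract above usually refers to rewriting the second-order problem as a first-order system in a flux variable and the original unknown. The sketch below uses one common textbook convention; the homogeneous Dirichlet boundary condition, the sign of the flux, and the symbols $\sigma$, $u$, $f$, $\Omega$ are assumptions for illustration and may differ from the paper's own setup.

$$
\sigma = -\nabla u, \qquad \operatorname{div} \sigma = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega .
$$

The corresponding weak formulation seeks $(\sigma, u) \in H(\operatorname{div};\Omega) \times L^2(\Omega)$ such that

$$
(\sigma, \tau) - (u, \operatorname{div} \tau) = 0 \quad \forall\, \tau \in H(\operatorname{div};\Omega), \qquad (\operatorname{div} \sigma, v) = (f, v) \quad \forall\, v \in L^2(\Omega),
$$

where $(\cdot,\cdot)$ denotes the $L^2(\Omega)$ inner product. The first equation follows from integrating $\sigma = -\nabla u$ against $\tau$ by parts and using $u = 0$ on the boundary.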

User Inference Attacks on Large Language Models

  • Authors: Nikhil Kandpal, Krishna Pillutla, Alina Oprea, Peter Kairouz, Christopher A. Choquette-Choo, Zheng Xu
  • Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2310.09266
  • Pdf link: https://arxiv.org/pdf/2310.09266
  • Abstract Fine-tuning is a common and effective method for tailoring large language models (LLMs) to specialized tasks and applications. In this paper, we study the privacy implications of fine-tuning LLMs on user data. To this end, we define a realistic threat model, called user inference, wherein an attacker infers whether or not a user's data was used for fine-tuning. We implement attacks for this threat model that require only a small set of samples from a user (possibly different from the samples used for training) and black-box access to the fine-tuned LLM. We find that LLMs are susceptible to user inference attacks across a variety of fine-tuning datasets, at times with near-perfect attack success rates. Further, we investigate which properties make users vulnerable to user inference, finding that outlier users (i.e., those with data distributions sufficiently different from other users) and users who contribute large quantities of data are most susceptible to attack. Finally, we explore several heuristics for mitigating privacy attacks. We find that interventions in the training algorithm, such as batch or per-example gradient clipping and early stopping, fail to prevent user inference. However, limiting the number of fine-tuning samples from a single user can reduce attack effectiveness, albeit at the cost of reducing the total amount of fine-tuning data.
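
The last mitigation mentioned in the abstract above, limiting how many fine-tuning samples any single user contributes, can be illustrated with a small preprocessing step. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cap_samples_per_user`, the `(user_id, text)` sample layout, and the random subsampling strategy are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Hashable, str]  # (user_id, text); this layout is an assumption


def cap_samples_per_user(
    samples: Sequence[Sample], max_per_user: int, seed: int = 0
) -> List[Sample]:
    """Return a subset of `samples` with at most `max_per_user` items per user.

    Users over the cap are randomly subsampled; users at or under the cap
    keep all of their samples. Hypothetical helper, not from the paper.
    """
    if max_per_user < 0:
        raise ValueError("max_per_user must be non-negative")

    rng = random.Random(seed)  # local RNG so the result is reproducible

    # Group the texts by the user who contributed them.
    by_user: Dict[Hashable, List[str]] = defaultdict(list)
    for user_id, text in samples:
        by_user[user_id].append(text)

    capped: List[Sample] = []
    for user_id, texts in by_user.items():
        if len(texts) > max_per_user:
            texts = rng.sample(texts, max_per_user)
        capped.extend((user_id, text) for text in texts)
    return capped


if __name__ == "__main__":
    data = [("alice", "a1"), ("alice", "a2"), ("alice", "a3"), ("bob", "b1")]
    subset = cap_samples_per_user(data, max_per_user=2)
    print(len(subset))
```

As the abstract notes, a cap like this trades privacy against data volume: every sample dropped from a heavy contributor is also a sample the fine-tuned model no longer learns from.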

Keyword: super-resolution

There is no result

zoq, Oct 16 '23 06:10