arxiv-updates
arxiv-updates copied to clipboard
New submissions for Fri, 29 Sep 23
Keyword: sgd
AutoEncoding Tree for City Generation and Applications
- Authors: Authors: Wenyu Han, Congcong Wen, Lazarus Chok, Yan Liang Tan, Sheung Lung Chan, Hang Zhao, Chen Feng
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.15941
- Pdf link: https://arxiv.org/pdf/2309.15941
- Abstract City modeling and generation have attracted an increased interest in various applications, including gaming, urban planning, and autonomous driving. Unlike previous works focused on the generation of single objects or indoor scenes, the huge volumes of spatial data in cities pose a challenge to the generative models. Furthermore, few publicly available 3D real-world city datasets also hinder the development of methods for city generation. In this paper, we first collect over 3,000,000 geo-referenced objects for the city of New York, Zurich, Tokyo, Berlin, Boston and several other large cities. Based on this dataset, we propose AETree, a tree-structured auto-encoder neural network, for city generation. Specifically, we first propose a novel Spatial-Geometric Distance (SGD) metric to measure the similarity between building layouts and then construct a binary tree over the raw geometric data of building based on the SGD metric. Next, we present a tree-structured network whose encoder learns to extract and merge spatial information from bottom-up iteratively. The resulting global representation is reversely decoded for reconstruction or generation. To address the issue of long-dependency as the level of the tree increases, a Long Short-Term Memory (LSTM) Cell is employed as a basic network element of the proposed AETree. Moreover, we introduce a novel metric, Overlapping Area Ratio (OAR), to quantitatively evaluate the generation results. Experiments on the collected dataset demonstrate the effectiveness of the proposed model on 2D and 3D city generation. Furthermore, the latent features learned by AETree can serve downstream urban planning applications.
Keyword: optimization
Accelerating Motion Planning via Optimal Transport
- Authors: Authors: An T. Le, Georgia Chalvatzaki, Armin Biess, Jan Peters
- Subjects: Robotics (cs.RO); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2309.15970
- Pdf link: https://arxiv.org/pdf/2309.15970
- Abstract Motion planning is still an open problem for many disciplines, e.g., robotics, autonomous driving, due to issues like high planning times that hinder real-time, efficient decision-making. A class of methods striving to provide smooth solutions is gradient-based trajectory optimization. However, those methods might suffer from bad local minima, while for many settings, they may be inapplicable due to the absence of easy access to objectives-gradients. In response to these issues, we introduce Motion Planning via Optimal Transport (MPOT) - a gradient-free method that optimizes a batch of smooth trajectories over highly nonlinear costs, even for high-dimensional tasks, while imposing smoothness through a Gaussian Process dynamics prior via planning-as-inference perspective. To facilitate batch trajectory optimization, we introduce an original zero-order and highly-parallelizable update rule -- the Sinkhorn Step, which uses the regular polytope family for its search directions; each regular polytope, centered on trajectory waypoints, serves as a local neighborhood, effectively acting as a trust region, where the Sinkhorn Step "transports" local waypoints toward low-cost regions. We theoretically show that Sinkhorn Step guides the optimizing parameters toward local minima regions on non-convex objective functions. We then show the efficiency of MPOT in a range of problems from low-dimensional point-mass navigation to high-dimensional whole-body robot motion planning, evincing its superiority compared with popular motion planners and paving the way for new applications of optimal transport in motion planning.
GasMono: Geometry-Aided Self-Supervised Monocular Depth Estimation for Indoor Scenes
- Authors: Authors: Chaoqiang Zhao, Matteo Poggi, Fabio Tosi, Lei Zhou, Qiyu Sun, Yang Tang, Stefano Mattoccia
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16019
- Pdf link: https://arxiv.org/pdf/2309.16019
- Abstract This paper tackles the challenges of self-supervised monocular depth estimation in indoor scenes caused by large rotation between frames and low texture. We ease the learning process by obtaining coarse camera poses from monocular sequences through multi-view geometry to deal with the former. However, we found that limited by the scale ambiguity across different scenes in the training dataset, a na"ive introduction of geometric coarse poses cannot play a positive role in performance improvement, which is counter-intuitive. To address this problem, we propose to refine those poses during training through rotation and translation/scale optimization. To soften the effect of the low texture, we combine the global reasoning of vision transformers with an overfitting-aware, iterative self-distillation mechanism, providing more accurate depth guidance coming from the network itself. Experiments on NYUv2, ScanNet, 7scenes, and KITTI datasets support the effectiveness of each component in our framework, which sets a new state-of-the-art for indoor self-supervised monocular depth estimation, as well as outstanding generalization ability. Code and models are available at https://github.com/zxcqlf/GasMono
Model Predictive Planning: Towards Real-Time Multi-Trajectory Planning Around Obstacles
- Authors: Authors: Matthew T. Wallace, Brett Streetman, Laurent Lessard
- Subjects: Systems and Control (eess.SY); Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2309.16024
- Pdf link: https://arxiv.org/pdf/2309.16024
- Abstract This paper presents a motion planning scheme we call Model Predictive Planning (MPP), designed to optimize trajectories through obstacle-laden environments. The approach involves path planning, trajectory refinement through the solution of a quadratic program, and real-time selection of optimal trajectories. The paper highlights three technical innovations: a raytracing-based path-to-trajectory refinement, the integration of this technique with a multi-path planner to overcome difficulties due to local minima, and a method to achieve timescale separation in trajectory optimization. The scheme is demonstrated through simulations on a 2D longitudinal aircraft model and shows strong obstacle avoidance performance.
Improving Adaptive Online Learning Using Refined Discretization
- Authors: Authors: Zhiyu Zhang, Heng Yang, Ashok Cutkosky, Ioannis Ch. Paschalidis
- Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16044
- Pdf link: https://arxiv.org/pdf/2309.16044
- Abstract We study unconstrained Online Linear Optimization with Lipschitz losses. The goal is to simultaneously achieve ($i$) second order gradient adaptivity; and ($ii$) comparator norm adaptivity also known as "parameter freeness" in the literature. Existing regret bounds (Cutkosky and Orabona, 2018; Mhammedi and Koolen, 2020; Jacobsen and Cutkosky, 2022) have the suboptimal $O(\sqrt{V_T\log V_T})$ dependence on the gradient variance $V_T$, while the present work improves it to the optimal rate $O(\sqrt{V_T})$ using a novel continuous-time-inspired algorithm, without any impractical doubling trick. This result can be extended to the setting with unknown Lipschitz constant, eliminating the range ratio problem from prior works (Mhammedi and Koolen, 2020). Concretely, we first show that the aimed simultaneous adaptivity can be achieved fairly easily in a continuous time analogue of the problem, where the environment is modeled by an arbitrary continuous semimartingale. Then, our key innovation is a new discretization argument that preserves such adaptivity in the discrete time adversarial setting. This refines a non-gradient-adaptive discretization argument from (Harvey et al., 2023), both algorithmically and analytically, which could be of independent interest.
Analytic regularity of strong solutions for the complexified stochastic non-linear Poisson Boltzmann Equation
- Authors: Authors: Brian Choi, Jie Xu, Trevor Norton, Mark Kon, Julio Enrique Castrillon-Candas
- Subjects: Numerical Analysis (math.NA)
- Arxiv link: https://arxiv.org/abs/2309.16068
- Pdf link: https://arxiv.org/pdf/2309.16068
- Abstract Semi-linear elliptic Partial Differential Equations (PDEs) such as the non-linear Poisson Boltzmann Equation (nPBE) is highly relevant for non-linear electrostatics in computational biology and chemistry. It is of particular importance for modeling potential fields from molecules in solvents or plasmas with stochastic fluctuations. The extensive applications include ones in condensed matter and solid state physics, chemical physics, electrochemistry, biochemistry, thermodynamics, statistical mechanics, and materials science, among others. In this paper we study the complex analytic properties of semi-linear elliptic Partial Differential Equations with respect to random fluctuations on the domain. We first prove the existence and uniqueness of the nPBE on a bounded domain in $\mathbb{R}^3$. This proof relies on the application of a contraction mapping reasoning, as the standard convex optimization argument for the deterministic nPBE no longer applies. Using the existence and uniqueness result we subsequently show that solution to the nPBE admits an analytic extension onto a well defined region in the complex hyperplane with respect to the number of stochastic variables. Due to the analytic extension, stochastic collocation theory for sparse grids predict algebraic to sub-exponential convergence rates with respect to the number of knots. A series of numerical experiments with sparse grids is consistent with this prediction and the analyticity result. Finally, this approach readily extends to a wide class of semi-linear elliptic PDEs.
Generative Semi-supervised Learning with Meta-Optimized Synthetic Samples
- Authors: Authors: Shin'ya Yamaguchi
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16143
- Pdf link: https://arxiv.org/pdf/2309.16143
- Abstract Semi-supervised learning (SSL) is a promising approach for training deep classification models using labeled and unlabeled datasets. However, existing SSL methods rely on a large unlabeled dataset, which may not always be available in many real-world applications due to legal constraints (e.g., GDPR). In this paper, we investigate the research question: Can we train SSL models without real unlabeled datasets? Instead of using real unlabeled datasets, we propose an SSL method using synthetic datasets generated from generative foundation models trained on datasets containing millions of samples in diverse domains (e.g., ImageNet). Our main concepts are identifying synthetic samples that emulate unlabeled samples from generative foundation models and training classifiers using these synthetic samples. To achieve this, our method is formulated as an alternating optimization problem: (i) meta-learning of generative foundation models and (ii) SSL of classifiers using real labeled and synthetic unlabeled samples. For (i), we propose a meta-learning objective that optimizes latent variables to generate samples that resemble real labeled samples and minimize the validation loss. For (ii), we propose a simple unsupervised loss function that regularizes the feature extractors of classifiers to maximize the performance improvement obtained from synthetic samples. We confirm that our method outperforms baselines using generative foundation models on SSL. We also demonstrate that our methods outperform SSL using real unlabeled datasets in scenarios with extremely small amounts of labeled datasets. This suggests that synthetic samples have the potential to provide improvement gains more efficiently than real unlabeled data.
BEVHeight++: Toward Robust Visual Centric 3D Object Detection
- Authors: Authors: Lei Yang, Tao Tang, Jun Li, Peng Chen, Kun Yuan, Li Wang, Yi Huang, Xinyu Zhang, Kaicheng Yu
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16179
- Pdf link: https://arxiv.org/pdf/2309.16179
- Abstract While most recent autonomous driving system focuses on developing perception methods on ego-vehicle sensors, people tend to overlook an alternative approach to leverage intelligent roadside cameras to extend the perception ability beyond the visual range. We discover that the state-of-the-art vision-centric bird's eye view detection methods have inferior performances on roadside cameras. This is because these methods mainly focus on recovering the depth regarding the camera center, where the depth difference between the car and the ground quickly shrinks while the distance increases. In this paper, we propose a simple yet effective approach, dubbed BEVHeight++, to address this issue. In essence, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods. By incorporating both height and depth encoding techniques, we achieve a more accurate and robust projection from 2D to BEV spaces. On popular 3D detection benchmarks of roadside cameras, our method surpasses all previous vision-centric methods by a significant margin. In terms of the ego-vehicle scenario, our BEVHeight++ possesses superior over depth-only methods. Specifically, it yields a notable improvement of +1.9% NDS and +1.1% mAP over BEVDepth when evaluated on the nuScenes validation set. Moreover, on the nuScenes test set, our method achieves substantial advancements, with an increase of +2.8% NDS and +1.7% mAP, respectively.
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
- Authors: Authors: Chaoqi Wang, Yibo Jiang, Chenghao Yang, Han Liu, Yuxin Chen
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16240
- Pdf link: https://arxiv.org/pdf/2309.16240
- Abstract The increasing capabilities of large language models (LLMs) raise opportunities for artificial general intelligence but concurrently amplify safety concerns, such as potential misuse of AI systems, necessitating effective AI alignment. Reinforcement Learning from Human Feedback (RLHF) has emerged as a promising pathway towards AI alignment but brings forth challenges due to its complexity and dependence on a separate reward model. Direct Preference Optimization (DPO) has been proposed as an alternative, and it remains equivalent to RLHF under the reverse KL regularization constraint. This paper presents $f$-DPO, a generalized approach to DPO by incorporating diverse divergence constraints. We show that under certain $f$-divergences, including Jensen-Shannon divergence, forward KL divergences and $\alpha$-divergences, the complex relationship between the reward and optimal policy can also be simplified by addressing the Karush-Kuhn-Tucker conditions. This eliminates the need for estimating the normalizing constant in the Bradley-Terry model and enables a tractable mapping between the reward function and the optimal policy. Our approach optimizes LLMs to align with human preferences in a more efficient and supervised manner under a broad set of divergence constraints. Empirically, adopting these divergences ensures a balance between alignment performance and generation diversity. Importantly, $f$-DPO outperforms PPO-based methods in divergence efficiency, and divergence constraints directly influence expected calibration error (ECE).
Nondestructive chicken egg fertility detection using CNN-transfer learning algorithms
- Authors: Authors: Shoffan Saifullah, Rafal Drezewski, Anton Yudhana, Andri Pranolo, Wilis Kaswijanti, Andiko Putro Suryotomo, Seno Aji Putra, Alin Khaliduzzaman, Anton Satria Prabuwono, Nathalie Japkowicz
- Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- Arxiv link: https://arxiv.org/abs/2309.16257
- Pdf link: https://arxiv.org/pdf/2309.16257
- Abstract This study explored the application of CNN-Transfer Learning for nondestructive chicken egg fertility detection for precision poultry hatchery practices. Four models, VGG16, ResNet50, InceptionNet, and MobileNet, were trained and evaluated on a dataset (200 single egg images) using augmented images (rotation, flip, scale, translation, and reflection). Although the training results demonstrated that all models achieved high accuracy, indicating their ability to accurately learn and classify chicken eggs' fertility state, when evaluated on the testing set, variations in accuracy and performance were observed. InceptionNet exhibited the best overall performance, accurately classifying fertile and non-fertile eggs. It demonstrated excellent performance in both training and testing sets in all parameters of the evaluation metrics. In testing set, it achieved an accuracy of 0.98, a sensitivity of 1 for detecting fertile eggs, and a specificity of 0.96 for identifying non-fertile eggs. The higher performance is attributed to its unique architecture efficiently capturing features at different scales leading to improved accuracy and robustness. Further optimization and fine-tuning of the models might necessary to address the limitations in accurately detecting fertile and non-fertile eggs in case of other models. This study highlighted the potential of CNN-Transfer Learning for nondestructive fertility detection and emphasizes the need for further research to enhance the models' capabilities and ensure accurate classification.
Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning
- Authors: Authors: Wenke Huang, Mang Ye, Zekun Shi, Bo Du
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
- Arxiv link: https://arxiv.org/abs/2309.16286
- Pdf link: https://arxiv.org/pdf/2309.16286
- Abstract Federated learning is an important privacy-preserving multi-party learning paradigm, involving collaborative learning with others and local updating on private data. Model heterogeneity and catastrophic forgetting are two crucial challenges, which greatly limit the applicability and generalizability. This paper presents a novel FCCL+, federated correlation and similarity learning with non-target distillation, facilitating the both intra-domain discriminability and inter-domain generalization. For heterogeneity issue, we leverage irrelevant unlabeled public data for communication between the heterogeneous participants. We construct cross-correlation matrix and align instance similarity distribution on both logits and feature levels, which effectively overcomes the communication barrier and improves the generalizable ability. For catastrophic forgetting in local updating stage, FCCL+ introduces Federated Non Target Distillation, which retains inter-domain knowledge while avoiding the optimization conflict issue, fulling distilling privileged inter-domain information through depicting posterior classes relation. Considering that there is no standard benchmark for evaluating existing heterogeneous federated learning under the same setting, we present a comprehensive benchmark with extensive representative methods under four domain shift scenarios, supporting both heterogeneous and homogeneous federated settings. Empirical results demonstrate the superiority of our method and the efficiency of modules on various scenarios.
Multi-scale Recurrent LSTM and Transformer Network for Depth Completion
- Authors: Authors: Xiaogang Jia, Yusong Tan, Songlei Jian, Yonggang Che
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16301
- Pdf link: https://arxiv.org/pdf/2309.16301
- Abstract Lidar depth completion is a new and hot topic of depth estimation. In this task, it is the key and difficult point to fuse the features of color space and depth space. In this paper, we migrate the classic LSTM and Transformer modules from NLP to depth completion and redesign them appropriately. Specifically, we use Forget gate, Update gate, Output gate, and Skip gate to achieve the efficient fusion of color and depth features and perform loop optimization at multiple scales. Finally, we further fuse the deep features through the Transformer multi-head attention mechanism. Experimental results show that without repetitive network structure and post-processing steps, our method can achieve state-of-the-art performance by adding our modules to a simple encoder-decoder network structure. Our method ranks first on the current mainstream autonomous driving KITTI benchmark dataset. It can also be regarded as a backbone network for other methods, which likewise achieves state-of-the-art performance.
Weakly-Supervised Video Anomaly Detection with Snippet Anomalous Attention
- Authors: Authors: Yidan Fan, Yongxin Yu, Wenhuan Lu, Yahong Han
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16309
- Pdf link: https://arxiv.org/pdf/2309.16309
- Abstract With a focus on abnormal events contained within untrimmed videos, there is increasing interest among researchers in video anomaly detection. Among different video anomaly detection scenarios, weakly-supervised video anomaly detection poses a significant challenge as it lacks frame-wise labels during the training stage, only relying on video-level labels as coarse supervision. Previous methods have made attempts to either learn discriminative features in an end-to-end manner or employ a twostage self-training strategy to generate snippet-level pseudo labels. However, both approaches have certain limitations. The former tends to overlook informative features at the snippet level, while the latter can be susceptible to noises. In this paper, we propose an Anomalous Attention mechanism for weakly-supervised anomaly detection to tackle the aforementioned problems. Our approach takes into account snippet-level encoded features without the supervision of pseudo labels. Specifically, our approach first generates snippet-level anomalous attention and then feeds it together with original anomaly scores into a Multi-branch Supervision Module. The module learns different areas of the video, including areas that are challenging to detect, and also assists the attention optimization. Experiments on benchmark datasets XDViolence and UCF-Crime verify the effectiveness of our method. Besides, thanks to the proposed snippet-level attention, we obtain a more precise anomaly localization.
EFFL: Egalitarian Fairness in Federated Learning for Mitigating Matthew Effect
- Authors: Authors: Jiashi Gao, Changwu Huang, Ming Tang, Shin Hwei Tan, Xin Yao, Xuetao Wei
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2309.16338
- Pdf link: https://arxiv.org/pdf/2309.16338
- Abstract Recent advances in federated learning (FL) enable collaborative training of machine learning (ML) models from large-scale and widely dispersed clients while protecting their privacy. However, when different clients' datasets are heterogeneous, traditional FL mechanisms produce a global model that does not adequately represent the poorer clients with limited data resources, resulting in lower accuracy and higher bias on their local data. According to the Matthew effect, which describes how the advantaged gain more advantage and the disadvantaged lose more over time, deploying such a global model in client applications may worsen the resource disparity among the clients and harm the principles of social welfare and fairness. To mitigate the Matthew effect, we propose Egalitarian Fairness Federated Learning (EFFL), where egalitarian fairness refers to the global model learned from FL has: (1) equal accuracy among clients; (2) equal decision bias among clients. Besides achieving egalitarian fairness among the clients, EFFL also aims for performance optimality, minimizing the empirical risk loss and the bias for each client; both are essential for any ML model training, whether centralized or decentralized. We formulate EFFL as a constrained multi-constrained multi-objectives optimization (MCMOO) problem, with the decision bias and egalitarian fairness as constraints and the minimization of the empirical risk losses on all clients as multiple objectives to be optimized. We propose a gradient-based three-stage algorithm to obtain the Pareto optimal solutions within the constraint space. Extensive experiments demonstrate that EFFL outperforms other state-of-the-art FL algorithms in achieving a high-performance global model with enhanced egalitarian fairness among all clients.
QUIC on the Highway: Evaluating Performance on High-rate Links
- Authors: Authors: Benedikt Jaeger, Johannes Zirngibl, Marcel Kempf, Kevin Ploch, Georg Carle
- Subjects: Networking and Internet Architecture (cs.NI)
- Arxiv link: https://arxiv.org/abs/2309.16395
- Pdf link: https://arxiv.org/pdf/2309.16395
- Abstract QUIC is a new protocol standardized in 2021 designed to improve on the widely used TCP / TLS stack. The main goal is to speed up web traffic via HTTP, but it is also used in other areas like tunneling. Based on UDP it offers features like reliable in-order delivery, flow and congestion control, streambased multiplexing, and always-on encryption using TLS 1.3. Other than with TCP, QUIC implements all these features in user space, only requiring kernel interaction for UDP. While running in user space provides more flexibility, it profits less from efficiency and optimization within the kernel. Multiple implementations exist, differing in programming language, architecture, and design choices. This paper presents an extension to the QUIC Interop Runner, a framework for testing interoperability of QUIC implementations. Our contribution enables reproducible QUIC benchmarks on dedicated hardware. We provide baseline results on 10G links, including multiple implementations, evaluate how OS features like buffer sizes and NIC offloading impact QUIC performance, and show which data rates can be achieved with QUIC compared to TCP. Our results show that QUIC performance varies widely between client and server implementations from 90 Mbit/s to 4900 Mbit/s. We show that the OS generally sets the default buffer size too small, which should be increased by at least an order of magnitude based on our findings. Furthermore, QUIC benefits less from NIC offloading and AES NI hardware acceleration while both features improve the goodput of TCP to around 8000 Mbit/s. Our framework can be applied to evaluate the effects of future improvements to the protocol or the OS.
Genetic Engineering Algorithm (GEA): An Efficient Metaheuristic Algorithm for Solving Combinatorial Optimization Problems
- Authors: Authors: Majid Sohrabi, Amir M. Fathollahi-Fard, Vasilii A. Gromov
- Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
- Arxiv link: https://arxiv.org/abs/2309.16413
- Pdf link: https://arxiv.org/pdf/2309.16413
- Abstract Genetic Algorithms (GAs) are known for their efficiency in solving combinatorial optimization problems, thanks to their ability to explore diverse solution spaces, handle various representations, exploit parallelism, preserve good solutions, adapt to changing dynamics, handle combinatorial diversity, and provide heuristic search. However, limitations such as premature convergence, lack of problem-specific knowledge, and randomness of crossover and mutation operators make GAs generally inefficient in finding an optimal solution. To address these limitations, this paper proposes a new metaheuristic algorithm called the Genetic Engineering Algorithm (GEA) that draws inspiration from genetic engineering concepts. GEA redesigns the traditional GA while incorporating new search methods to isolate, purify, insert, and express new genes based on existing ones, leading to the emergence of desired traits and the production of specific chromosomes based on the selected genes. Comparative evaluations against state-of-the-art algorithms on benchmark instances demonstrate the superior performance of GEA, showcasing its potential as an innovative and efficient solution for combinatorial optimization problems.
db-CBS: Discontinuity-Bounded Conflict-Based Search for Multi-Robot Kinodynamic Motion Planning
- Authors: Authors: Akmaral Moldagalieva, Joaquim Ortiz-Haro, Marc Toussaint, Wolfgang Hönig
- Subjects: Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2309.16445
- Pdf link: https://arxiv.org/pdf/2309.16445
- Abstract This paper presents a multi-robot kinodynamic motion planner that enables a team of robots with different dynamics, actuation limits, and shapes to reach their goals in challenging environments. We solve this problem by combining Conflict-Based Search (CBS), a multi-agent path finding method, and discontinuity-bounded A*, a single-robot kinodynamic motion planner. Our method, db-CBS, operates in three levels. Initially, we compute trajectories for individual robots using a graph search that allows bounded discontinuities between precomputed motion primitives. The second level identifies inter-robot collisions and resolves them by imposing constraints on the first level. The third and final level uses the resulting solution with discontinuities as an initial guess for a joint space trajectory optimization. The procedure is repeated with a reduced discontinuity bound. Our approach is anytime, probabilistically complete, asymptotically optimal, and finds near-optimal solutions quickly. Experimental results with robot dynamics such as unicycle, double integrator, and car with trailer in different settings show that our method is capable of solving challenging tasks with a higher success rate and lower cost than the existing state-of-the-art.
Diverse Target and Contribution Scheduling for Domain Generalization
- Authors: Authors: Shaocong Long, Qianyu Zhou, Chenhao Ying, Lizhuang Ma, Yuan Luo
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16460
- Pdf link: https://arxiv.org/pdf/2309.16460
- Abstract Generalization under the distribution shift has been a great challenge in computer vision. The prevailing practice of directly employing the one-hot labels as the training targets in domain generalization~(DG) can lead to gradient conflicts, making it insufficient for capturing the intrinsic class characteristics and hard to increase the intra-class variation. Besides, existing methods in DG mostly overlook the distinct contributions of source (seen) domains, resulting in uneven learning from these domains. To address these issues, we firstly present a theoretical and empirical analysis of the existence of gradient conflicts in DG, unveiling the previously unexplored relationship between distribution shifts and gradient conflicts during the optimization process. In this paper, we present a novel perspective of DG from the empirical source domain's risk and propose a new paradigm for DG called Diverse Target and Contribution Scheduling (DTCS). DTCS comprises two innovative modules: Diverse Target Supervision (DTS) and Diverse Contribution Balance (DCB), with the aim of addressing the limitations associated with the common utilization of one-hot labels and equal contributions for source domains in DG. In specific, DTS employs distinct soft labels as training targets to account for various feature distributions across domains and thereby mitigates the gradient conflicts, and DCB dynamically balances the contributions of source domains by ensuring a fair decline in losses of different source domains. Extensive experiments with analysis on four benchmark datasets show that the proposed method achieves a competitive performance in comparison with the state-of-the-art approaches, demonstrating the effectiveness and advantages of the proposed DTCS.
Towards Poisoning Fair Representations
- Authors: Authors: Tianci Liu, Haoyu Wang, Feijie Wu, Hengtong Zhang, Pan Li, Lu Su, Jing Gao
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2309.16487
- Pdf link: https://arxiv.org/pdf/2309.16487
- Abstract Fair machine learning seeks to mitigate model prediction bias against certain demographic subgroups such as elder and female. Recently, fair representation learning (FRL) trained by deep neural networks has demonstrated superior performance, whereby representations containing no demographic information are inferred from the data and then used as the input to classification or other downstream tasks. Despite the development of FRL methods, their vulnerability under data poisoning attack, a popular protocol to benchmark model robustness under adversarial scenarios, is under-explored. Data poisoning attacks have been developed for classical fair machine learning methods which incorporate fairness constraints into shallow-model classifiers. Nonetheless, these attacks fall short in FRL due to notably different fairness goals and model architectures. This work proposes the first data poisoning framework attacking FRL. We induce the model to output unfair representations that contain as much demographic information as possible by injecting carefully crafted poisoning samples into the training data. This attack entails a prohibitive bilevel optimization, wherefore an effective approximated solution is proposed. A theoretical analysis on the needed number of poisoning samples is derived and sheds light on defending against the attack. Experiments on benchmark fairness datasets and state-of-the-art fair representation learning models demonstrate the superiority of our attack.
From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford's Geometric Algebra and Convexity
- Authors: Authors: Mert Pilanci
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16512
- Pdf link: https://arxiv.org/pdf/2309.16512
- Abstract In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization. We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss. Furthermore, the training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset. This structure is given in terms of signed volumes of triangles and parallelotopes generated by data vectors. The convex problem finds a small subset of samples via $\ell_1$ regularization to discover only relevant wedge product features. Our analysis provides a novel perspective on the inner workings of deep neural networks and sheds light on the role of the hidden layers.
MotionLM: Multi-Agent Motion Forecasting as Language Modeling
- Authors: Authors: Ari Seff, Brian Cera, Dian Chen, Mason Ng, Aurick Zhou, Nigamaa Nayakanti, Khaled S. Refaat, Rami Al-Rfou, Benjamin Sapp
- Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2309.16534
- Pdf link: https://arxiv.org/pdf/2309.16534
- Abstract Reliable forecasting of the future behavior of road agents is a critical component to safe planning in autonomous vehicles. Here, we represent continuous trajectories as sequences of discrete motion tokens and cast multi-agent motion prediction as a language modeling task over this domain. Our model, MotionLM, provides several advantages: First, it does not require anchors or explicit latent variable optimization to learn multimodal distributions. Instead, we leverage a single standard language modeling objective, maximizing the average log probability over sequence tokens. Second, our approach bypasses post-hoc interaction heuristics where individual agent trajectory generation is conducted prior to interactive scoring. Instead, MotionLM produces joint distributions over interactive agent futures in a single autoregressive decoding process. In addition, the model's sequential factorization enables temporally causal conditional rollouts. The proposed approach establishes new state-of-the-art performance for multi-agent motion prediction on the Waymo Open Motion Dataset, ranking 1st on the interactive challenge leaderboard.
Compilation as a Defense: Enhancing DL Model Attack Robustness via Tensor Optimization
- Authors: Authors: Stefan Trawicki, William Hackett, Lewis Birch, Neeraj Suri, Peter Garraghan
- Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
- Arxiv link: https://arxiv.org/abs/2309.16577
- Pdf link: https://arxiv.org/pdf/2309.16577
- Abstract Adversarial Machine Learning (AML) is a rapidly growing field of security research, with an often overlooked area being model attacks through side-channels. Previous works show such attacks to be serious threats, though little progress has been made on efficient remediation strategies that avoid costly model re-engineering. This work demonstrates a new defense against AML side-channel attacks using model compilation techniques, namely tensor optimization. We show relative model attack effectiveness decreases of up to 43% using tensor optimization, discuss the implications, and direction of future work.
A Physics Informed Machine Learning Method for Power System Model Parameter Optimization
- Authors: Authors: Georg Kordowich, Johann Jaeger
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2309.16579
- Pdf link: https://arxiv.org/pdf/2309.16579
- Abstract This paper proposes a gradient descent based optimization method that relies on automatic differentiation for the computation of gradients. The method uses tools and techniques originally developed in the field of artificial neural networks and applies them to power system simulations. It can be used as a one-shot physics informed machine learning approach for the identification of uncertain power system simulation parameters. Additionally, it can optimize parameters with respect to a desired system behavior. The paper focuses on presenting the theoretical background and showing exemplary use-cases for both parameter identification and optimization using a single machine infinite busbar system. The results imply a generic applicability for a wide range of problems.
Text-to-3D using Gaussian Splatting
- Authors: Authors: Zilong Chen, Feng Wang, Huaping Liu
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16585
- Pdf link: https://arxiv.org/pdf/2309.16585
- Abstract In this paper, we present Gaussian Splatting based text-to-3D generation (GSGEN), a novel approach for generating high-quality 3D objects. Previous methods suffer from inaccurate geometry and limited fidelity due to the absence of 3D prior and proper representation. We leverage 3D Gaussian Splatting, a recent state-of-the-art representation, to address existing shortcomings by exploiting the explicit nature that enables the incorporation of 3D prior. Specifically, our method adopts a progressive optimization strategy, which includes a geometry optimization stage and an appearance refinement stage. In geometry optimization, a coarse representation is established under a 3D geometry prior along with the ordinary 2D SDS loss, ensuring a sensible and 3D-consistent rough shape. Subsequently, the obtained Gaussians undergo an iterative refinement to enrich details. In this stage, we increase the number of Gaussians by compactness-based densification to enhance continuity and improve fidelity. With these designs, our approach can generate 3D content with delicate details and more accurate geometry. Extensive evaluations demonstrate the effectiveness of our method, especially for capturing high-frequency components. Video results are provided at https://gsgen3d.github.io. Our code is available at https://github.com/gsgen3d/gsgen
Transfer Learning for Bayesian Optimization on Heterogeneous Search Spaces
- Authors: Authors: Zhou Fan, Xinran Han, Zi Wang
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16597
- Pdf link: https://arxiv.org/pdf/2309.16597
- Abstract Bayesian optimization (BO) is a popular black-box function optimization method, which makes sequential decisions based on a Bayesian model, typically a Gaussian process (GP), of the function. To ensure the quality of the model, transfer learning approaches have been developed to automatically design GP priors by learning from observations on "training" functions. These training functions are typically required to have the same domain as the "test" function (black-box function to be optimized). In this paper, we introduce MPHD, a model pre-training method on heterogeneous domains, which uses a neural net mapping from domain-specific contexts to specifications of hierarchical GPs. MPHD can be seamlessly integrated with BO to transfer knowledge across heterogeneous search spaces. Our theoretical and empirical results demonstrate the validity of MPHD and its superior performance on challenging black-box function optimization tasks.
Adaptive Output-Feedback Model Predictive Control of Hammerstein Systems with Unknown Linear Dynamics
- Authors: Authors: Mohammadreza Kamaldar, Dennis S. Bernstein
- Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2309.16617
- Pdf link: https://arxiv.org/pdf/2309.16617
- Abstract This paper considers model predictive control of Hammerstein systems, where the linear dynamics are a priori unknown and the input nonlinearity is known. Predictive cost adaptive control (PCAC) is applied to this system using recursive least squares for online, closed-loop system identification with optimization over a receding horizon performed by quadratic programming (QP). In order to account for the input nonlinearity, the input matrix is defined to be control dependent, and the optimization is performed iteratively. This technique is applied to output stabilization of a chain of integrators with unknown dynamics under control saturation and deadzone input nonlinearity.
On Learning with LAD
- Authors: Authors: C. A. Jothishwaran, Biplav Srivastava, Jitin Singla, Sugata Gangopadhyay
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2309.16630
- Pdf link: https://arxiv.org/pdf/2309.16630
- Abstract The logical analysis of data, LAD, is a technique that yields two-class classifiers based on Boolean functions having disjunctive normal form (DNF) representation. Although LAD algorithms employ optimization techniques, the resulting binary classifiers or binary rules do not lead to overfitting. We propose a theoretical justification for the absence of overfitting by estimating the Vapnik-Chervonenkis dimension (VC dimension) for LAD models where hypothesis sets consist of DNFs with a small number of cubic monomials. We illustrate and confirm our observations empirically.
Sparse Submodular Function Minimization
- Authors: Authors: Andrei Graur, Haotian Jiang, Aaron Sidford
- Subjects: Data Structures and Algorithms (cs.DS); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2309.16632
- Pdf link: https://arxiv.org/pdf/2309.16632
- Abstract In this paper we study the problem of minimizing a submodular function $f : 2^V \rightarrow \mathbb{R}$ that is guaranteed to have a $k$-sparse minimizer. We give a deterministic algorithm that computes an additive $\epsilon$-approximate minimizer of such $f$ in $\widetilde{O}(\mathsf{poly}(k) \log(|f|/\epsilon))$ parallel depth using a polynomial number of queries to an evaluation oracle of $f$, where $|f| = \max_{S \subseteq V} |f(S)|$. Further, we give a randomized algorithm that computes an exact minimizer of $f$ with high probability using $\widetilde{O}(|V| \cdot \mathsf{poly}(k))$ queries and polynomial time. When $k = \widetilde{O}(1)$, our algorithms use either nearly-constant parallel depth or a nearly-linear number of evaluation oracle queries. All previous algorithms for this problem either use $\Omega(|V|)$ parallel depth or $\Omega(|V|^2)$ queries. In contrast to state-of-the-art weakly-polynomial and strongly-polynomial time algorithms for SFM, our algorithms use first-order optimization methods, e.g., mirror descent and follow the regularized leader. We introduce what we call {\em sparse dual certificates}, which encode information on the structure of sparse minimizers, and both our parallel and sequential algorithms provide new algorithmic tools for allowing first-order optimization methods to efficiently compute them. Correspondingly, our algorithm does not invoke fast matrix multiplication or general linear system solvers and in this sense is more combinatorial than previous state-of-the-art methods.
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
- Authors: Authors: Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, Gang Zeng
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16653
- Pdf link: https://arxiv.org/pdf/2309.16653
- Abstract Recent advances in 3D content creation mostly leverage optimization-based 3D generation via score distillation sampling (SDS). Though promising results have been exhibited, these methods often suffer from slow per-sample optimization, limiting their practical usage. In this paper, we propose DreamGaussian, a novel 3D content generation framework that achieves both efficiency and quality simultaneously. Our key insight is to design a generative 3D Gaussian Splatting model with companioned mesh extraction and texture refinement in UV space. In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks. To further enhance the texture quality and facilitate downstream applications, we introduce an efficient algorithm to convert 3D Gaussians into textured meshes and apply a fine-tuning stage to refine the details. Extensive experiments demonstrate the superior efficiency and competitive generation quality of our proposed approach. Notably, DreamGaussian produces high-quality textured meshes in just 2 minutes from a single-view image, achieving approximately 10 times acceleration compared to existing methods.
Keyword: adam
There is no result
Keyword: gradient
Accelerating Motion Planning via Optimal Transport
- Authors: Authors: An T. Le, Georgia Chalvatzaki, Armin Biess, Jan Peters
- Subjects: Robotics (cs.RO); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2309.15970
- Pdf link: https://arxiv.org/pdf/2309.15970
- Abstract Motion planning is still an open problem for many disciplines, e.g., robotics, autonomous driving, due to issues like high planning times that hinder real-time, efficient decision-making. A class of methods striving to provide smooth solutions is gradient-based trajectory optimization. However, those methods might suffer from bad local minima, while for many settings, they may be inapplicable due to the absence of easy access to objectives-gradients. In response to these issues, we introduce Motion Planning via Optimal Transport (MPOT) - a gradient-free method that optimizes a batch of smooth trajectories over highly nonlinear costs, even for high-dimensional tasks, while imposing smoothness through a Gaussian Process dynamics prior via planning-as-inference perspective. To facilitate batch trajectory optimization, we introduce an original zero-order and highly-parallelizable update rule -- the Sinkhorn Step, which uses the regular polytope family for its search directions; each regular polytope, centered on trajectory waypoints, serves as a local neighborhood, effectively acting as a trust region, where the Sinkhorn Step "transports" local waypoints toward low-cost regions. We theoretically show that Sinkhorn Step guides the optimizing parameters toward local minima regions on non-convex objective functions. We then show the efficiency of MPOT in a range of problems from low-dimensional point-mass navigation to high-dimensional whole-body robot motion planning, evincing its superiority compared with popular motion planners and paving the way for new applications of optimal transport in motion planning.
Statistical CSI Based Beamforming for Reconfigurable Intelligent Surface Aided MISO Systems with Channel Correlation
- Authors: Authors: Haochen Li, Zhiwen Pan, Bin Wang, Nan Liu, Xiaohu You
- Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
- Arxiv link: https://arxiv.org/abs/2309.16026
- Pdf link: https://arxiv.org/pdf/2309.16026
- Abstract Reconfigurable intelligent surface (RIS) is a promising candidate technology of the upcoming Sixth Generation (6G) communication system for its ability to provide unprecedented spectral and energy efficiency increment through passive beamforming. However, it is challenging to obtain instantaneous channel state information (I-CSI) for RIS, which obliges us to use statistical channel state information (S-CSI) to achieve passive beamforming. In this paper, RIS-aided multiple-input single-output (MISO) multi-user downlink communication system with correlated channels is investigated. Then, we formulate the problem of joint beamforming design at the AP and RIS to maximize the sum ergodic spectral efficiency (ESE) of all users to improve the network capacity. Since it is too hard to compute sum ESE, an ESE approximation is adopted to reformulate the problem into a more tractable form. Then, we present two joint beamforming algorithms, namely the singular value decomposition-gradient descent (SVD-GD) algorithm and the fractional programming-gradient descent (FP-GD) algorithm. Simulation results show the effectiveness of our proposed algorithms and validate that 2-bits quantizer is enough for RIS phase shifts implementation.
Improving Adaptive Online Learning Using Refined Discretization
- Authors: Authors: Zhiyu Zhang, Heng Yang, Ashok Cutkosky, Ioannis Ch. Paschalidis
- Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2309.16044
- Pdf link: https://arxiv.org/pdf/2309.16044
- Abstract We study unconstrained Online Linear Optimization with Lipschitz losses. The goal is to simultaneously achieve ($i$) second order gradient adaptivity; and ($ii$) comparator norm adaptivity also known as "parameter freeness" in the literature. Existing regret bounds (Cutkosky and Orabona, 2018; Mhammedi and Koolen, 2020; Jacobsen and Cutkosky, 2022) have the suboptimal $O(\sqrt{V_T\log V_T})$ dependence on the gradient variance $V_T$, while the present work improves it to the optimal rate $O(\sqrt{V_T})$ using a novel continuous-time-inspired algorithm, without any impractical doubling trick. This result can be extended to the setting with unknown Lipschitz constant, eliminating the range ratio problem from prior works (Mhammedi and Koolen, 2020). Concretely, we first show that the aimed simultaneous adaptivity can be achieved fairly easily in a continuous time analogue of the problem, where the environment is modeled by an arbitrary continuous semimartingale. Then, our key innovation is a new discretization argument that preserves such adaptivity in the discrete time adversarial setting. This refines a non-gradient-adaptive discretization argument from (Harvey et al., 2023), both algorithmically and analytically, which could be of independent interest.
Hybrid Digital-Wave Domain Channel Estimator for Stacked Intelligent Metasurface Enabled Multi-User MISO Systems
- Authors: Authors: Qurrat-Ul-Ain Nadeem, Jiancheng An, Anas Chaaban
- Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
- Arxiv link: https://arxiv.org/abs/2309.16204
- Pdf link: https://arxiv.org/pdf/2309.16204
- Abstract Stacked intelligent metasurface (SIM) is an emerging programmable metasurface architecture that can implement signal processing directly in the electromagnetic wave domain, thereby enabling efficient implementation of ultra-massive multiple-input multiple-output (MIMO) transceivers with a limited number of radio frequency (RF) chains. Channel estimation (CE) is challenging for SIM-enabled communication systems due to the multi-layer architecture of SIM, and because we need to estimate large dimensional channels between the SIM and users with a limited number of RF chains. To efficiently solve this problem, we develop a novel hybrid digital-wave domain channel estimator, in which the received training symbols are first processed in the wave domain within the SIM layers, and then processed in the digital domain. The wave domain channel estimator, parametrized by the phase shifts applied by the meta-atoms in all layers, is optimized to minimize the mean squared error (MSE) using a gradient descent algorithm, within which the digital part is optimally updated. For an SIM-enabled multi-user system equipped with 4 RF chains and a 6-layer SIM with 64 meta-atoms each, the proposed estimator yields an MSE that is very close to that achieved by fully digital CE in a massive MIMO system employing 64 RF chains. This high CE accuracy is achieved at the cost of a training overhead that can be reduced by exploiting the potential low rank of channel correlation matrices.
GInX-Eval: Towards In-Distribution Evaluation of Graph Neural Network Explanations
- Authors: Authors: Kenza Amara, Mennatallah El-Assady, Rex Ying
- Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2309.16223
- Pdf link: https://arxiv.org/pdf/2309.16223
- Abstract Diverse explainability methods of graph neural networks (GNN) have recently been developed to highlight the edges and nodes in the graph that contribute the most to the model predictions. However, it is not clear yet how to evaluate the correctness of those explanations, whether it is from a human or a model perspective. One unaddressed bottleneck in the current evaluation procedure is the problem of out-of-distribution explanations, whose distribution differs from those of the training data. This important issue affects existing evaluation metrics such as the popular faithfulness or fidelity score. In this paper, we show the limitations of faithfulness metrics. We propose GInX-Eval (Graph In-distribution eXplanation Evaluation), an evaluation procedure of graph explanations that overcomes the pitfalls of faithfulness and offers new insights on explainability methods. Using a retraining strategy, the GInX score measures how informative removed edges are for the model and the EdgeRank score evaluates if explanatory edges are correctly ordered by their importance. GInX-Eval verifies if ground-truth explanations are instructive to the GNN model. In addition, it shows that many popular methods, including gradient-based methods, produce explanations that are not better than a random designation of edges as important subgraphs, challenging the findings of current works in the area. Results with GInX-Eval are consistent across multiple datasets and align with human evaluation.
EFFL: Egalitarian Fairness in Federated Learning for Mitigating Matthew Effect
- Authors: Authors: Jiashi Gao, Changwu Huang, Ming Tang, Shin Hwei Tan, Xin Yao, Xuetao Wei
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2309.16338
- Pdf link: https://arxiv.org/pdf/2309.16338
- Abstract Recent advances in federated learning (FL) enable collaborative training of machine learning (ML) models from large-scale and widely dispersed clients while protecting their privacy. However, when different clients' datasets are heterogeneous, traditional FL mechanisms produce a global model that does not adequately represent the poorer clients with limited data resources, resulting in lower accuracy and higher bias on their local data. According to the Matthew effect, which describes how the advantaged gain more advantage and the disadvantaged lose more over time, deploying such a global model in client applications may worsen the resource disparity among the clients and harm the principles of social welfare and fairness. To mitigate the Matthew effect, we propose Egalitarian Fairness Federated Learning (EFFL), where egalitarian fairness refers to the global model learned from FL has: (1) equal accuracy among clients; (2) equal decision bias among clients. Besides achieving egalitarian fairness among the clients, EFFL also aims for performance optimality, minimizing the empirical risk loss and the bias for each client; both are essential for any ML model training, whether centralized or decentralized. We formulate EFFL as a constrained multi-constrained multi-objectives optimization (MCMOO) problem, with the decision bias and egalitarian fairness as constraints and the minimization of the empirical risk losses on all clients as multiple objectives to be optimized. We propose a gradient-based three-stage algorithm to obtain the Pareto optimal solutions within the constraint space. Extensive experiments demonstrate that EFFL outperforms other state-of-the-art FL algorithms in achieving a high-performance global model with enhanced egalitarian fairness among all clients.
Diverse Target and Contribution Scheduling for Domain Generalization
- Authors: Authors: Shaocong Long, Qianyu Zhou, Chenhao Ying, Lizhuang Ma, Yuan Luo
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2309.16460
- Pdf link: https://arxiv.org/pdf/2309.16460
- Abstract Generalization under the distribution shift has been a great challenge in computer vision. The prevailing practice of directly employing the one-hot labels as the training targets in domain generalization~(DG) can lead to gradient conflicts, making it insufficient for capturing the intrinsic class characteristics and hard to increase the intra-class variation. Besides, existing methods in DG mostly overlook the distinct contributions of source (seen) domains, resulting in uneven learning from these domains. To address these issues, we firstly present a theoretical and empirical analysis of the existence of gradient conflicts in DG, unveiling the previously unexplored relationship between distribution shifts and gradient conflicts during the optimization process. In this paper, we present a novel perspective of DG from the empirical source domain's risk and propose a new paradigm for DG called Diverse Target and Contribution Scheduling (DTCS). DTCS comprises two innovative modules: Diverse Target Supervision (DTS) and Diverse Contribution Balance (DCB), with the aim of addressing the limitations associated with the common utilization of one-hot labels and equal contributions for source domains in DG. In specific, DTS employs distinct soft labels as training targets to account for various feature distributions across domains and thereby mitigates the gradient conflicts, and DCB dynamically balances the contributions of source domains by ensuring a fair decline in losses of different source domains. Extensive experiments with analysis on four benchmark datasets show that the proposed method achieves a competitive performance in comparison with the state-of-the-art approaches, demonstrating the effectiveness and advantages of the proposed DTCS.
A Physics Informed Machine Learning Method for Power System Model Parameter Optimization
- Authors: Authors: Georg Kordowich, Johann Jaeger
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2309.16579
- Pdf link: https://arxiv.org/pdf/2309.16579
- Abstract This paper proposes a gradient descent based optimization method that relies on automatic differentiation for the computation of gradients. The method uses tools and techniques originally developed in the field of artificial neural networks and applies them to power system simulations. It can be used as a one-shot physics informed machine learning approach for the identification of uncertain power system simulation parameters. Additionally, it can optimize parameters with respect to a desired system behavior. The paper focuses on presenting the theoretical background and showing exemplary use-cases for both parameter identification and optimization using a single machine infinite busbar system. The results imply a generic applicability for a wide range of problems.
Revisiting Neural Program Smoothing for Fuzzing
- Authors: Authors: Maria-Irina Nicolae, Max Eisele, Andreas Zeller
- Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
- Arxiv link: https://arxiv.org/abs/2309.16618
- Pdf link: https://arxiv.org/pdf/2309.16618
- Abstract Testing with randomly generated inputs (fuzzing) has gained significant traction due to its capacity to expose program vulnerabilities automatically. Fuzz testing campaigns generate large amounts of data, making them ideal for the application of machine learning (ML). Neural program smoothing (NPS), a specific family of ML-guided fuzzers, aims to use a neural network as a smooth approximation of the program target for new test case generation. In this paper, we conduct the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers (>11 CPU years and >5.5 GPU years), and make the following contributions: (1) We find that the original performance claims for NPS fuzzers do not hold; a gap we relate to fundamental, implementation, and experimental limitations of prior works. (2) We contribute the first in-depth analysis of the contribution of machine learning and gradient-based mutations in NPS. (3) We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance, but that standard gray-box fuzzers almost always surpass NPS-based fuzzers. (4) As a consequence, we propose new guidelines targeted at benchmarking fuzzing based on machine learning, and present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers. Neuzz++, MLFuzz, and all our data are public.
Keyword: super-resolution
There is no result