arxiv-updates
New submissions for Wed, 15 Nov 23
Keyword: sgd
Sparsity-Preserving Differentially Private Training of Large Embedding Models
- Authors: Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang
- Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
- Arxiv link: https://arxiv.org/abs/2311.08357
- Pdf link: https://arxiv.org/pdf/2311.08357
- Abstract As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without compromising model accuracy by much. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models. Our algorithms achieve substantial reductions ($10^6 \times$) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
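To see why naive DP-SGD destroys gradient sparsity, the following minimal sketch clips per-example gradients, sums them, and adds dense Gaussian noise. It illustrates generic DP-SGD only, not the DP-FEST or DP-AdaFEST algorithms; the toy dimensions, `max_norm`, and `noise_multiplier` values are assumptions chosen for illustration.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_sgd_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip(grad, max_norm)):
            total[i] += g
    sigma = noise_multiplier * max_norm
    # Noise is added to every coordinate, including those no example touched.
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]

# Toy embedding gradient (assumed setup): 1,000 coordinates,
# each example touches exactly one of them.
dim = 1000
grads = []
for idx in (3, 3, 17, 42):
    g = [0.0] * dim
    g[idx] = 2.0
    grads.append(g)

rng = random.Random(0)
plain = [sum(col) / len(grads) for col in zip(*grads)]
private = dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)

nonzero_plain = sum(1 for v in plain if v != 0.0)
nonzero_private = sum(1 for v in private if v != 0.0)
print(nonzero_plain, nonzero_private)
```

The non-private averaged gradient has only three nonzero coordinates, while the noised gradient is dense, so every embedding row would need an update.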
Keyword: optimization
Modeling Choice via Self-Attention
- Authors: Joohwan Ko, Andrew A. Li
- Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.07607
- Pdf link: https://arxiv.org/pdf/2311.07607
- Abstract Models of choice are a fundamental input to many now-canonical optimization problems in the field of Operations Management, including assortment, inventory, and price optimization. Naturally, accurate estimation of these models from data is a critical step in the application of these optimization problems in practice, and so it is perhaps surprising that such choice estimation has until now been accomplished almost exclusively, both in theory and in practice, (a) without the use of deep learning in any meaningful way, and (b) via evaluation on limited data with constantly changing metrics. This is in stark contrast to the vast majority of similar learning applications, for which the practice of machine learning suggests that (a) neural network-based models are typically state-of-the-art, and (b) strict standardization on evaluation procedures (datasets, metrics, etc.) is crucial. Thus motivated, we first propose a choice model that is the first to successfully (both theoretically and practically) leverage a modern neural network architectural concept (self-attention). Theoretically, we show that our attention-based choice model is a low-rank generalization of the Halo Multinomial Logit model, a recent model that parsimoniously captures irrational choice effects and has seen empirical success. We prove that whereas the Halo-MNL requires $\Omega(m^2)$ data samples to estimate, where $m$ is the number of products, our model supports a natural nonconvex estimator (in particular, that which a standard neural network implementation would apply) which admits a near-optimal stationary point with $O(m)$ samples. We then establish the first realistic-scale benchmark for choice estimation on real data and use this benchmark to run the largest evaluation of existing choice models to date. We find that the model we propose is dominant over both short-term and long-term data periods.
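As a rough illustration of the Halo-MNL idea mentioned in the abstract, the sketch below computes choice probabilities when each offered product can shift the utility of the others. The parameterization (base utilities plus a pairwise `halo` matrix, no outside option) is an assumption based on the usual presentation of the model and may differ from the paper's; the example values are hypothetical.

```python
import math

def halo_mnl_probs(assortment, utility, halo):
    """Choice probabilities over an offered assortment under a Halo-MNL-style model.

    utility[j] is the base utility of product j; halo[k][j] is the shift that
    offering product k induces on product j's utility (assumed parameterization).
    """
    scores = {}
    for j in assortment:
        shifted = utility[j] + sum(halo[k][j] for k in assortment if k != j)
        scores[j] = math.exp(shifted)
    z = sum(scores.values())
    return {j: s / z for j, s in scores.items()}

utility = [0.0, 0.0, 0.0]
no_halo = [[0.0] * 3 for _ in range(3)]
halo = [[0.0] * 3 for _ in range(3)]
halo[2][0] = 1.0  # hypothetical: offering product 2 makes product 0 more attractive

p_plain = halo_mnl_probs([0, 1, 2], utility, no_halo)
p_halo = halo_mnl_probs([0, 1, 2], utility, halo)
print(p_plain[0], p_halo[0])
```

With all halo terms at zero this reduces to a plain multinomial logit. The full pairwise matrix has $m(m-1)$ off-diagonal entries, which gives some intuition for the quadratic sample requirement the abstract contrasts with a low-rank model.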
Rethinking and Benchmarking Predict-then-Optimize Paradigm for Combinatorial Optimization Problems
- Authors: Haoyu Geng, Han Ruan, Runzhong Wang, Yang Li, Yang Wang, Lei Chen, Junchi Yan
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2311.07633
- Pdf link: https://arxiv.org/pdf/2311.07633
- Abstract Numerous web applications rely on solving combinatorial optimization problems, such as energy cost-aware scheduling, budget allocation on web advertising, and graph matching on social networks. However, many optimization problems involve unknown coefficients, and improper predictions of these factors may lead to inferior decisions which may cause energy wastage, inefficient resource allocation, inappropriate matching in social networks, etc. Such a research topic is referred to as "Predict-Then-Optimize (PTO)" which considers the performance of prediction and decision-making in a unified system. A noteworthy recent development is end-to-end methods that directly optimize the ultimate decision quality and claim to yield better results than the traditional two-stage approach. However, the evaluation benchmarks in this field are fragmented and the effectiveness of various models in different scenarios remains unclear, hindering the comprehensive assessment and fast deployment of these methods. To address these issues, we provide a comprehensive categorization of current approaches and integrate existing experimental scenarios to establish a unified benchmark, elucidating the circumstances under which end-to-end training yields improvements, as well as the contexts in which it performs ineffectively. We also introduce and open-source a new dataset for the industrial combinatorial advertising problem for inclusive finance. We hope the rethinking and benchmarking of PTO could facilitate more convenient evaluation and deployment, and inspire further improvements in both academia and industry within this field.
Near-Field Integrated Sensing, Positioning, and Communication: A Downlink and Uplink Framework
- Authors: Haochen Li, Zhaolin Wang, Xidong Mu, Zhiwen Pan, Yuanwei Liu
- Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
- Arxiv link: https://arxiv.org/abs/2311.07722
- Pdf link: https://arxiv.org/pdf/2311.07722
- Abstract A near-field integrated sensing, positioning, and communication (ISPAC) framework is proposed, where a base station (BS) simultaneously serves multiple communication users and carries out target sensing and positioning. A novel double-array structure is proposed to enable the near-field ISPAC at the BS. Specifically, a small-scale assisting transceiver (AT) is attached to the large-scale main transceiver (MT) to empower the communication system with the ability of sensing and positioning. Based on the proposed framework, the joint angle and distance Cramér-Rao bound (CRB) is first derived. Then, the CRB is minimized subject to the minimum communication rate requirement in both downlink and uplink ISPAC scenarios: 1) For downlink ISPAC, a downlink target positioning algorithm is proposed and a penalty dual decomposition (PDD)-based double-loop algorithm is developed to tackle the non-convex optimization problem. 2) For uplink ISPAC, an uplink target positioning algorithm is proposed and an efficient alternating optimization algorithm is conceived to solve the non-convex CRB minimization problem with coupled user communication and target probing design. Both proposed optimization algorithms can converge to a stationary point of the CRB minimization problem. Numerical results show that: 1) The proposed ISPAC system can locate the target in both angle and distance domains merely relying on a single BS and limited bandwidths; and 2) the positioning performance achieved by the hybrid-analog-and-digital ISPAC approaches that achieved by fully digital ISPAC when the communication rate requirement is not stringent.
In-context Learning and Gradient Descent Revisited
- Authors: Tomer Bar Nathan, Gilad Deutch, Nadav Magar, Guy Dar
- Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.07772
- Pdf link: https://arxiv.org/pdf/2311.07772
- Abstract In-context learning (ICL) has shown impressive results in few-shot learning tasks, yet its underlying mechanism is still not fully understood. Recent works suggest that ICL can be thought of as a gradient descent (GD) based optimization process. While promising, these results mainly focus on simplified settings of ICL and provide only a preliminary evaluation of the similarities between the two methods. In this work, we revisit the comparison between ICL and GD-based finetuning and study what properties of ICL an equivalent process must follow. We highlight a major difference in the flow of information between ICL and standard finetuning. Namely, ICL can only rely on information from lower layers at every point, while finetuning depends on loss gradients from deeper layers. We refer to this discrepancy as Layer Causality and show that a layer causal variant of the finetuning process aligns with ICL on par with vanilla finetuning and is even better in most cases across relevant metrics. To the best of our knowledge, this is the first work to discuss this discrepancy explicitly and suggest a solution that tackles this problem with minimal changes.
Size-Aware Hypergraph Motifs
- Authors: Jason Niu, Ilya D. Amburg, Sinan G. Aksoy, Ahmet Erdem Sarıyüce
- Subjects: Discrete Mathematics (cs.DM); Social and Information Networks (cs.SI); Data Analysis, Statistics and Probability (physics.data-an); Physics and Society (physics.soc-ph)
- Arxiv link: https://arxiv.org/abs/2311.07783
- Pdf link: https://arxiv.org/pdf/2311.07783
- Abstract Complex systems frequently exhibit multi-way, rather than pairwise, interactions. These group interactions cannot be faithfully modeled as collections of pairwise interactions using graphs, and instead require hypergraphs. However, methods that analyze hypergraphs directly, rather than via lossy graph reductions, remain limited. Hypergraph motif mining holds promise in this regard, as motif patterns serve as building blocks for larger group interactions which are inexpressible by graphs. Recent work has focused on categorizing and counting hypergraph motifs based on the existence of nodes in hyperedge intersection regions. Here, we argue that the relative sizes of hyperedge intersections within motifs contain varied and valuable information. We propose a suite of efficient algorithms for finding triplets of hyperedges based on optimizing the sizes of these intersection patterns. This formulation uncovers interesting local patterns of interaction, finding hyperedge triplets that either (1) are the least correlated with each other, (2) have the highest pairwise but not groupwise correlation, or (3) are the most correlated with each other. We formalize this as a combinatorial optimization problem and design efficient algorithms based on filtering hyperedges. Our experimental evaluation shows that the resulting hyperedge triplets yield insightful information on real-world hypergraphs. Our approach is also orders of magnitude faster than a naive baseline implementation.
Leveraging Hamilton-Jacobi PDEs with time-dependent Hamiltonians for continual scientific machine learning
- Authors: Paula Chen, Tingwei Meng, Zongren Zou, Jérôme Darbon, George Em Karniadakis
- Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
- Arxiv link: https://arxiv.org/abs/2311.07790
- Pdf link: https://arxiv.org/pdf/2311.07790
- Abstract We address two major challenges in scientific machine learning (SciML): interpretability and computational efficiency. We increase the interpretability of certain learning processes by establishing a new theoretical connection between optimization problems arising from SciML and a generalized Hopf formula, which represents the viscosity solution to a Hamilton-Jacobi partial differential equation (HJ PDE) with time-dependent Hamiltonian. Namely, we show that when we solve certain regularized learning problems with integral-type losses, we actually solve an optimal control problem and its associated HJ PDE with time-dependent Hamiltonian. This connection allows us to reinterpret incremental updates to learned models as the evolution of an associated HJ PDE and optimal control problem in time, where all of the previous information is intrinsically encoded in the solution to the HJ PDE. As a result, existing HJ PDE solvers and optimal control algorithms can be reused to design new efficient training approaches for SciML that naturally coincide with the continual learning framework, while avoiding catastrophic forgetting. As a first exploration of this connection, we consider the special case of linear regression and leverage our connection to develop a new Riccati-based methodology for solving these learning problems that is amenable to continual learning applications. We also provide some corresponding numerical examples that demonstrate the potential computational and memory advantages our Riccati-based approach can provide.
Probabilistic Physics-integrated Neural Differentiable Modeling for Isothermal Chemical Vapor Infiltration Process
- Authors: Deepak Akhare, Zeping Chen, Richard Gulotty, Tengfei Luo, Jian-Xun Wang
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.07798
- Pdf link: https://arxiv.org/pdf/2311.07798
- Abstract Chemical vapor infiltration (CVI) is a widely adopted manufacturing technique used in producing carbon-carbon and carbon-silicon carbide composites. These materials are especially valued in the aerospace and automotive industries for their robust strength and lightweight characteristics. The densification process during CVI critically influences the final performance, quality, and consistency of these composite materials. Experimentally optimizing the CVI processes is challenging due to long experimental time and large optimization space. To address these challenges, this work takes a modeling-centric approach. Due to the complexities and limited experimental data of the isothermal CVI densification process, we have developed a data-driven predictive model using the physics-integrated neural differentiable (PiNDiff) modeling framework. An uncertainty quantification feature has been embedded within the PiNDiff method, bolstering the model's reliability and robustness. Through comprehensive numerical experiments involving both synthetic and real-world manufacturing data, the proposed method showcases its capability in modeling densification during the CVI process. This research highlights the potential of the PiNDiff framework as an instrumental tool for advancing our understanding, simulation, and optimization of the CVI manufacturing process, particularly when faced with sparse data and an incomplete description of the underlying physics.
A Primal-Dual Analysis of Monotone Submodular Maximization
- Authors: Deeparnab Chakrabarty, Luc Cote
- Subjects: Data Structures and Algorithms (cs.DS)
- Arxiv link: https://arxiv.org/abs/2311.07808
- Pdf link: https://arxiv.org/pdf/2311.07808
- Abstract In this paper we design a new primal-dual algorithm for the classic discrete optimization problem of maximizing a monotone submodular function subject to a cardinality constraint achieving the optimal approximation of $(1-1/e)$. This problem and its special case, the maximum $k$-coverage problem, have a wide range of applications in various fields including operations research, machine learning, and economics. While greedy algorithms have been known to achieve this approximation factor, our algorithms also provide a dual certificate which upper bounds the optimum value of any instance. This certificate may be used in practice to certify much stronger guarantees than the worst-case $(1-1/e)$ approximation factor.
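For context, the classic greedy baseline that the abstract refers to can be sketched for the maximum $k$-coverage special case as follows. This is the standard greedy algorithm, not the paper's primal-dual method, and it produces no dual certificate; the example sets are made up for illustration.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets, each time taking the one that covers the most new elements."""
    covered = set()
    chosen = []
    for _ in range(k):
        best = max(
            (i for i in range(len(sets)) if i not in chosen),
            key=lambda i: len(sets[i] - covered),
            default=None,
        )
        # Stop early if no remaining set adds any new element.
        if best is None or not (sets[best] - covered):
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
chosen, covered = greedy_max_coverage(sets, k=2)
print(chosen, sorted(covered))  # [0, 2] [1, 2, 3, 4, 5, 6]
```

Coverage is monotone and submodular, so this greedy rule attains the worst-case $(1-1/e)$ factor; on this small instance it happens to cover every element.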
Statistical Parameterized Physics-Based Machine Learning Digital Twin Models for Laser Powder Bed Fusion Process
- Authors: Yangfan Li, Satyajit Mojumder, Ye Lu, Abdullah Al Amin, Jiachen Guo, Xiaoyu Xie, Wei Chen, Gregory J. Wagner, Jian Cao, Wing Kam Liu
- Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA); Data Analysis, Statistics and Probability (physics.data-an)
- Arxiv link: https://arxiv.org/abs/2311.07821
- Pdf link: https://arxiv.org/pdf/2311.07821
- Abstract A digital twin (DT) is a virtual representation of physical processes, products, and/or systems that requires a high-fidelity computational model for continuous update through the integration of sensor data and user input. In the context of laser powder bed fusion (LPBF) additive manufacturing, a digital twin of the manufacturing process can offer predictions for the produced parts, diagnostics for manufacturing defects, as well as control capabilities. This paper introduces a parameterized physics-based digital twin (PPB-DT) for the statistical predictions of the LPBF metal additive manufacturing process. We accomplish this by creating a high-fidelity computational model that accurately represents the melt pool phenomena and subsequently calibrating and validating it through controlled experiments. In PPB-DT, a mechanistic reduced-order method-driven stochastic calibration process is introduced, which enables the statistical predictions of the melt pool geometries and the identification of defects such as lack-of-fusion porosity and surface roughness, specifically for diagnostic applications. Leveraging data derived from this physics-based model and experiments, we have trained a machine learning-based digital twin (PPB-ML-DT) model for predicting, monitoring, and controlling melt pool geometries. These proposed digital twin models can be employed for predictions, control, optimization, and quality assurance within the LPBF process, ultimately expediting product development and certification in LPBF-based metal additive manufacturing.
Adaptive Search Optimization: Dynamic Algorithm Selection and Caching for Enhanced Database Performance
- Authors: Hakikat Singh
- Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
- Arxiv link: https://arxiv.org/abs/2311.07826
- Pdf link: https://arxiv.org/pdf/2311.07826
- Abstract Efficient search operations in databases are paramount for timely retrieval of information in various applications. This research introduces a novel approach, combining dynamic algorithm selection and caching strategies, to optimize search performance. The proposed dynamic search algorithm intelligently switches between Binary and Interpolation Search based on dataset characteristics, significantly improving efficiency for non-uniformly distributed data. Additionally, a robust caching mechanism stores and retrieves previous search results, further enhancing computational efficiency. Theoretical analysis and extensive experiments demonstrate the effectiveness of the approach, showcasing its potential to revolutionize database performance in scenarios with diverse data distributions. This research contributes valuable insights and practical solutions to the realm of database optimization, offering a promising avenue for enhancing search operations in real-world applications.
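The combination described in the abstract (choosing between binary and interpolation search, plus a result cache) might look roughly like the sketch below. The selection rule `looks_uniform`, its `tolerance` parameter, and the `AdaptiveSearcher` class are hypothetical illustrations, not the paper's method, and the sketch assumes sorted integer keys.

```python
from bisect import bisect_left

def binary_search(data, target):
    """Return an index of target in sorted data, or -1 if absent."""
    i = bisect_left(data, target)
    return i if i < len(data) and data[i] == target else -1

def interpolation_search(data, target):
    """Probe where target would sit if the (integer) keys were evenly spaced."""
    lo, hi = 0, len(data) - 1
    while lo <= hi and data[lo] <= target <= data[hi]:
        if data[hi] == data[lo]:
            return lo if data[lo] == target else -1
        pos = lo + (target - data[lo]) * (hi - lo) // (data[hi] - data[lo])
        if data[pos] == target:
            return pos
        if data[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1

def looks_uniform(data, tolerance=2.0):
    """Hypothetical heuristic: the largest gap is within `tolerance` times the mean gap."""
    if len(data) < 3:
        return False
    gaps = [b - a for a, b in zip(data, data[1:])]
    mean_gap = (data[-1] - data[0]) / len(gaps)
    return mean_gap > 0 and max(gaps) <= tolerance * mean_gap

class AdaptiveSearcher:
    """Picks a search routine once per dataset and caches lookup results."""

    def __init__(self, data):
        self.data = sorted(data)
        self.search = interpolation_search if looks_uniform(self.data) else binary_search
        self.cache = {}

    def find(self, target):
        if target not in self.cache:
            self.cache[target] = self.search(self.data, target)
        return self.cache[target]

uniform = AdaptiveSearcher(range(0, 1000, 5))        # evenly spaced keys
skewed = AdaptiveSearcher([2**i for i in range(20)])  # exponentially spaced keys
print(uniform.search.__name__, uniform.find(250))
print(skewed.search.__name__, skewed.find(1024), skewed.find(3))
```

Interpolation search is only attractive when keys are close to evenly spaced, so the sketch falls back to binary search for the skewed dataset; a real system would also need cache invalidation when the data changes.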
AutoML for Large Capacity Modeling of Meta Ranking Systems
- Authors: Hang Yin, Kuang-Hung Liu, Mengying Sun, Yuxin Chen, Buyun Zhang, Jiang Liu, Vivek Sehgal, Rudresh Rajnikant Panchal, Eugen Hotaj, Xi Liu, Daifeng Guo, Jamey Zhang, Zhou Wang, Shali Jiang, Huayu Li, Zhengxing Chen, Wen-Yen Chen, Jiyan Yang, Wei Wen
- Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
- Arxiv link: https://arxiv.org/abs/2311.07870
- Pdf link: https://arxiv.org/pdf/2311.07870
- Abstract Web-scale ranking systems at Meta, serving billions of users, are complex. Improving ranking models is essential but engineering heavy. Automated Machine Learning (AutoML) can release engineers from the labor-intensive work of tuning ranking models; however, it is unknown if AutoML is efficient enough to meet tight production timelines in the real world and, at the same time, bring additional improvements to the strong baselines. Moreover, to achieve higher ranking performance, there is an ever-increasing demand to scale up ranking models to even larger capacity, which imposes more challenges on the efficiency. The large scale of models and tight production schedule require AutoML to outperform human baselines by only using a small number of model evaluation trials (around 100). We present a sampling-based AutoML method, focusing on neural architecture search and hyperparameter optimization, addressing these challenges in Meta-scale production when building large capacity models. Our approach efficiently handles large-scale data demands. It leverages a lightweight predictor-based searcher and reinforcement learning to explore vast search spaces, significantly reducing the number of model evaluations. Through experiments in large capacity modeling for CTR and CVR applications, we show that our method achieves outstanding Return on Investment (ROI) versus human tuned baselines, with up to 0.09% Normalized Entropy (NE) loss reduction or 25% Query per Second (QPS) increase by only sampling one hundred models on average from a curated search space. The proposed AutoML method has already made real-world impact where a discovered Instagram CTR model with up to -0.36% NE gain (over existing production baseline) was selected for a large-scale online A/B test and showed a statistically significant gain. These production results proved AutoML efficacy and accelerated its adoption in ranking systems at Meta.
Towards Transaction as a Service
- Authors: Yanfeng Zhang, Weixing Zhou, Yang Ren, Sihao Li, Guoliang Li, Ge Yu
- Subjects: Databases (cs.DB)
- Arxiv link: https://arxiv.org/abs/2311.07874
- Pdf link: https://arxiv.org/pdf/2311.07874
- Abstract This paper argues for decoupling transaction processing from existing two-layer cloud-native databases and making transaction processing an independent service. By building a transaction as a service (TaaS) layer, the transaction processing can be independently scaled for high resource utilization and can be independently upgraded for development agility. Accordingly, we architect an execution-transaction-storage three-layer cloud-native database. By connecting to TaaS, 1) the AP engines can be empowered with ACID TP capability, 2) multiple standalone TP engine instances can be incorporated to support multi-master distributed TP for horizontal scalability, 3) multiple execution engines with different data models can be integrated to support multi-model transactions, and 4) high performance TP is achieved through extensive TaaS optimizations and consistent evolution. Cloud-native databases deserve better architecture: we believe that TaaS provides a path forward to better cloud-native databases.
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
- Authors: Canzhe Zhao, Ruofeng Yang, Baoxiang Wang, Xuezhou Zhang, Shuai Li
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2311.07876
- Pdf link: https://arxiv.org/pdf/2311.07876
- Abstract In this work, we study the low-rank MDPs with adversarially changed losses in the full-information feedback setting. In particular, the unknown transition probability kernel admits a low-rank matrix decomposition \citep{REPUCB22}, and the loss functions may change adversarially but are revealed to the learner at the end of each episode. We propose a policy optimization-based algorithm POLO, and we prove that it attains the $\widetilde{O}(K^{\frac{5}{6}}A^{\frac{1}{2}}d\ln(1+M)/(1-\gamma)^2)$ regret guarantee, where $d$ is the rank of the transition kernel (and hence the dimension of the unknown representations), $A$ is the cardinality of the action space, $M$ is the cardinality of the model class, and $\gamma$ is the discount factor. Notably, our algorithm is oracle-efficient and has a regret guarantee with no dependence on the size of potentially arbitrarily large state space. Furthermore, we also prove an $\Omega(\frac{\gamma^2}{1-\gamma} \sqrt{d A K})$ regret lower bound for this problem, showing that low-rank MDPs are statistically more difficult to learn than linear MDPs in the regret minimization setting. To the best of our knowledge, we present the first algorithm that interleaves representation learning, exploration, and exploitation to achieve the sublinear regret guarantee for RL with nonlinear function approximation and adversarial losses.
VegaEdge: Edge AI Confluence Anomaly Detection for Real-Time Highway IoT-Applications
- Authors: Vinit Katariya, Fatema-E-Jannat, Armin Danesh Pazho, Ghazal Alinezhad Noghre, Hamed Tabkhi
- Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)
- Arxiv link: https://arxiv.org/abs/2311.07880
- Pdf link: https://arxiv.org/pdf/2311.07880
- Abstract Vehicle anomaly detection plays a vital role in highway safety applications such as accident prevention, rapid response, traffic flow optimization, and work zone safety. With the surge of the Internet of Things (IoT) in recent years, there has arisen a pressing demand for Artificial Intelligence (AI) based anomaly detection methods designed to meet the requirements of IoT devices. Catering to this futuristic vision, we introduce a lightweight approach to vehicle anomaly detection by utilizing the power of trajectory prediction. Our proposed design identifies vehicles deviating from expected paths, indicating highway risks from different camera-viewing angles from real-world highway datasets. On top of that, we present VegaEdge - a sophisticated AI confluence designed for real-time security and surveillance applications in modern highway settings through edge-centric IoT-embedded platforms equipped with our anomaly detection approach. Extensive testing across multiple platforms and traffic scenarios showcases the versatility and effectiveness of VegaEdge. This work also presents the Carolinas Anomaly Dataset (CAD), to bridge the existing gap in datasets tailored for highway anomalies. In real-world scenarios, our anomaly detection approach achieves an AUC-ROC of 0.94, and our proposed VegaEdge design, on an embedded IoT platform, processes 738 trajectories per second in a typical highway setting. The dataset is available at https://github.com/TeCSAR-UNCC/Carolinas_Dataset#chd-anomaly-test-set .
Collaborative planning and optimization for electric-thermal-hydrogen-coupled energy systems with portfolio selection of the complete hydrogen energy chain
- Authors: Xinning Yi, Tianguang Lu, Yixiao Li, Qian Ai, Ran Hao
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2311.07891
- Pdf link: https://arxiv.org/pdf/2311.07891
- Abstract Under the global low-carbon target, the uneven spatiotemporal distribution of renewable energy resources exacerbates the uncertainty and seasonal power imbalance. Additionally, the issue of an incomplete hydrogen energy chain is widely overlooked in planning models, which hinders the complete analysis of the role of hydrogen in energy systems. Therefore, this paper proposes a high-resolution collaborative planning model for electricity-thermal-hydrogen-coupled energy systems considering both the spatiotemporal distribution characteristics of renewable energy resources and the multi-scale bottom-to-top investment strategy for the complete hydrogen energy chain. Considering the high-resolution system operation flexibility, this paper proposes a hydrogen chain-based fast clustering optimization method that can handle high-dimensional data and multi-time scale operation characteristics. The model optimizes the geographical distribution and capacity configuration of the Northeast China energy system in 2050, with hourly operational characteristics. The planning optimization covered single-energy devices, multi-energy-coupled conversion devices, and electric-hydrogen transmission networks. Last but not least, this paper thoroughly examines the optimal portfolio selection of different hydrogen technologies based on the differences in cost, flexibility, and efficiency. In the Pareto analysis, the proposed model reduces CO2 emissions by 60% with a competitive cost. This paper provides a zero-carbon pathway for multi-energy systems with a cost 4% less than the social cost of carbon $44.6/ton, and the integration of the complete hydrogen energy chain reduces the renewable energy curtailment by 97.0%. Besides, the portfolio selection results indicate that the system favors the SOEC with the highest energy efficiency and the PEMFC with the fastest dynamic response when achieving zero-carbon emissions.
CP-SLAM: Collaborative Neural Point-based SLAM System
- Authors: Jiarui Hu, Mao Mao, Hujun Bao, Guofeng Zhang, Zhaopeng Cui
- Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2311.08013
- Pdf link: https://arxiv.org/pdf/2311.08013
- Abstract This paper presents a collaborative implicit neural simultaneous localization and mapping (SLAM) system with RGB-D image sequences, which consists of complete front-end and back-end modules including odometry, loop detection, sub-map fusion, and global refinement. In order to enable all these modules in a unified framework, we propose a novel neural point based 3D scene representation in which each point maintains a learnable neural feature for scene encoding and is associated with a certain keyframe. Moreover, a distributed-to-centralized learning strategy is proposed for the collaborative implicit SLAM to improve consistency and cooperation. A novel global optimization framework is also proposed to improve the system accuracy like traditional bundle adjustment. Experiments on various datasets demonstrate the superiority of the proposed method in both camera tracking and mapping.
Two-Stage Predict+Optimize for Mixed Integer Linear Programs with Unknown Parameters in Constraints
- Authors: Xinyi Hu, Jasper C.H. Lee, Jimmy H.M. Lee
- Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.08022
- Pdf link: https://arxiv.org/pdf/2311.08022
- Abstract Consider the setting of constrained optimization, with some parameters unknown at solving time and requiring prediction from relevant features. Predict+Optimize is a recent framework for end-to-end training supervised learning models for such predictions, incorporating information about the optimization problem in the training process in order to yield better predictions in terms of the quality of the predicted solution under the true parameters. Almost all prior works have focused on the special case where the unknowns appear only in the optimization objective and not the constraints. Hu et al. proposed the first adaptation of Predict+Optimize to handle unknowns appearing in constraints, but the framework has somewhat ad-hoc elements, and they provided a training algorithm only for covering and packing linear programs. In this work, we give a new simpler and more powerful framework called Two-Stage Predict+Optimize, which we believe should be the canonical framework for the Predict+Optimize setting. We also give a training algorithm usable for all mixed integer linear programs, vastly generalizing the applicability of the framework. Experimental results demonstrate the superior prediction performance of our training framework over all classical and state-of-the-art methods.
Adversarial Preference Optimization
- Authors: Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Nan Du
- Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.08045
- Pdf link: https://arxiv.org/pdf/2311.08045
- Abstract Human preference alignment is a crucial training step to improve the interaction quality of large language models (LLMs). Existing aligning methods depend on manually annotated preference data to guide the LLM optimization directions. However, in practice, continuously updating LLMs raises a distribution gap between model-generated samples and human-preferred responses, which hinders model fine-tuning efficiency. To mitigate this issue, previous methods require additional preference annotation on generated samples to adapt to the shifted distribution, which consumes a large amount of annotation resources. Targeting more efficient human preference optimization, we propose an adversarial preference optimization (APO) framework, where the LLM agent and the preference model update alternately via a min-max game. Without additional annotation, our APO method can self-adapt to the generation distribution gap through the adversarial learning process. In experiments, we empirically verify the effectiveness of APO in improving the LLM's helpfulness and harmlessness compared with rejection sampling baselines.
Optimizing Electric Vehicle Efficiency with Real-Time Telemetry using Machine Learning
- Authors: Authors: Aryaman Rao, Harshit Gupta, Parth Singh, Shivam Mittal, Utkrash Singh, Dinesh Kumar Vishwakarma
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2311.08085
- Pdf link: https://arxiv.org/pdf/2311.08085
- Abstract With natural resources degrading, energy efficiency has become an urgent priority for conservation and environmental protection. It is therefore crucial to look for advanced technology that minimizes energy consumption. This research focuses on the optimization of battery-electric city style vehicles through the use of a real-time in-car telemetry system that communicates between components through the robust Controller Area Network (CAN) protocol. By harnessing real-time data from various sensors embedded within vehicles, our driving assistance system provides the driver with actionable visual and haptic feedback that guides the driver toward the optimum driving style to minimize the power consumed by the vehicle. To develop the pace feedback mechanism for the driver, real-time data is collected through a Shell Eco Marathon Urban Concept vehicle platform and, after pre-processing, analyzed using the novel machine learning algorithm TEMSL, which outperforms the existing baseline approaches across various performance metrics. After numerous experiments, this method has proven effective in enhancing energy efficiency, guiding the driver along the track, and reducing human errors. The driving-assistance system offers a range of utilities, from cost savings and extended vehicle lifespan to significant contributions to environmental conservation and sustainable driving practices.
DiLoCo: Distributed Low-Communication Training of Language Models
- Authors: Authors: Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Rachita Chhaparia, Yani Donchev, Adhiguna Kuncoro, Marc'Aurelio Ranzato, Arthur Szlam, Jiajun Shen
- Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08105
- Pdf link: https://arxiv.org/pdf/2311.08105
- Abstract Large language models (LLMs) have become a critical component in many applications of machine learning. However, standard approaches to training LLMs require a large number of tightly interconnected accelerators, with devices exchanging gradients and other intermediate states at each optimization step. While it is difficult to build and maintain a single computing cluster hosting many accelerators, it might be easier to find several computing clusters each hosting a smaller number of devices. In this work, we propose a distributed optimization algorithm, Distributed Low-Communication (DiLoCo), that enables training of language models on islands of devices that are poorly connected. The approach is a variant of federated averaging, where the number of inner steps is large, the inner optimizer is AdamW, and the outer optimizer is Nesterov momentum. On the widely used C4 dataset, we show that DiLoCo on 8 workers performs as well as fully synchronous optimization while communicating 500 times less. DiLoCo exhibits great robustness to the data distribution of each worker. It is also robust to resources becoming unavailable over time and, conversely, it can seamlessly leverage resources that become available during training.
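The inner/outer structure described in this abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the toy quadratic losses, all hyperparameters, the decision to reset the AdamW state every round, and the use of the averaged parameter change as the outer "gradient" are assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np


def adamw_steps(theta, grad_fn, steps, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    """Run `steps` AdamW updates from `theta` with fresh optimizer state."""
    theta = np.array(theta, dtype=float)  # copy so the caller's array is untouched
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        # Decoupled weight decay is what distinguishes AdamW from Adam.
        theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * theta)
    return theta


def nesterov_step(theta, velocity, outer_grad, lr=0.5, mu=0.5):
    """One Nesterov-momentum update; returns (new_theta, new_velocity)."""
    velocity = mu * velocity + outer_grad
    return theta - lr * (outer_grad + mu * velocity), velocity


def train(worker_targets, rounds=20, inner_steps=50):
    """Toy run: worker i minimizes 0.5 * ||theta - worker_targets[i]||^2."""
    worker_targets = np.asarray(worker_targets, dtype=float)
    theta = np.zeros(worker_targets.shape[1])
    velocity = np.zeros_like(theta)
    for _ in range(rounds):
        # Inner phase: many local steps, no communication between workers.
        local = [adamw_steps(theta, lambda p, c=c: p - c, inner_steps)
                 for c in worker_targets]
        # Outer phase: one exchange per round. The averaged parameter change
        # plays the role of a gradient for the outer optimizer (assumption).
        outer_grad = theta - np.mean(local, axis=0)
        theta, velocity = nesterov_step(theta, velocity, outer_grad)
    return theta
```

Calling `train([[1.0, -2.0], [3.0, 0.0]])` runs two simulated workers. Communication happens once per round rather than once per gradient step, which is the property the abstract emphasizes.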
Reconfigurable Intelligent Surface for Physical Layer Security in 6G-IoT: Designs, Issues, and Advances
- Authors: Authors: Waqas Khalid, M. Atif Ur Rehman, Trinh Van Chien, Zeeshan Kaleem, Howon Lee, Heejung Yu
- Subjects: Networking and Internet Architecture (cs.NI)
- Arxiv link: https://arxiv.org/abs/2311.08112
- Pdf link: https://arxiv.org/pdf/2311.08112
- Abstract Sixth-generation (6G) networks pose substantial security risks because confidential information is transmitted over wireless channels with a broadcast nature, and various attack vectors emerge. Physical layer security (PLS) exploits the dynamic characteristics of wireless environments to provide secure communications, while reconfigurable intelligent surfaces (RISs) can facilitate PLS by controlling wireless transmissions. With RIS-aided PLS, a lightweight security solution can be designed for low-end Internet of Things (IoT) devices, depending on the design scenario and communication objective. This article discusses RIS-aided PLS designs for 6G-IoT networks against eavesdropping and jamming attacks. The theoretical background and literature review of RIS-aided PLS are discussed, and design solutions related to resource allocation, beamforming, artificial noise, and cooperative communication are presented. We provide simulation results to show the effectiveness of RIS in terms of PLS. In addition, we examine the research issues and possible solutions for RIS modeling, channel modeling and estimation, optimization, and machine learning. Finally, we discuss recent advances, including STAR-RIS and malicious RIS.
Smart Skin separation control using distributed-input distributed-output, multi-modal actuators, and machine learning
- Authors: Authors: Songqi Li
- Subjects: Systems and Control (eess.SY); Fluid Dynamics (physics.flu-dyn)
- Arxiv link: https://arxiv.org/abs/2311.08116
- Pdf link: https://arxiv.org/pdf/2311.08116
- Abstract Efficient flow separation control offers significant economic benefit. This study applies a machine learning algorithm to minimize flow separation in Smart Skin, a flow control device that features distributed-input and distributed-output (DIDO). Smart Skin comprises 30 hybrid actuator units, each integrating a height-adjustable vortex generator and a mini-jet actuator. These units are deployed on a backward-facing ramp to reduce flow separation in a distributed manner. To monitor the flow state, distributed pressure taps are deployed around the multi-modal actuators. Parametric studies indicate that the mapping between control parameters and separation control performance is complex. To optimize separation control, a cutting-edge variant of the particle swarm optimization (PSO-TPME) is used for the control parameters in the Smart Skin. This algorithm is capable of achieving fast optimization in high-dimensional parameter spaces. The results demonstrate the efficiency of PSO-TPME, and the optimized solution significantly outperforms the best result from the parametric study. These findings point to a promising future for machine learning-based flow control using distributed actuators and sensors.
Ask One More Time: Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios
- Authors: Authors: Lei Lin, Jiayi Fu, Pengli Liu, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai
- Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
- Arxiv link: https://arxiv.org/abs/2311.08154
- Pdf link: https://arxiv.org/pdf/2311.08154
- Abstract Although chain-of-thought (CoT) prompting combined with language models has achieved encouraging results on complex reasoning tasks, the naive greedy decoding used in CoT prompting often causes repetitiveness and local optimality. To address this shortcoming, ensemble-optimization obtains multiple reasoning paths and assembles the final answer from them. However, current ensemble-optimization methods either simply employ rule-based post-processing such as self-consistency, or train an additional model based on several task-related human annotations to select the best one among multiple reasoning paths, yet fail to generalize to realistic settings where the type of input questions is unknown or the answer format of reasoning paths is unknown. To avoid their limitations, we propose self-agreement, a generalizable ensemble-optimization method applicable in almost all scenarios where the type of input questions and the answer format of reasoning paths may be known or unknown. Self-agreement first samples from the language model's decoder to generate a diverse set of reasoning paths, and subsequently prompts the language model one more time to determine the optimal answer by selecting the most agreed answer among the sampled reasoning paths. Self-agreement simultaneously achieves remarkable performance on six public reasoning benchmarks and superior generalization capabilities.
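The two-stage procedure in this abstract (sample several reasoning paths, then ask the model once more which answer they agree on) could look roughly like the sketch below. The `generate` callable, its `(prompt, temperature)` signature, and both prompt templates are hypothetical placeholders introduced here; the paper's actual prompts are not given in the abstract.

```python
from typing import Callable, List


def self_agreement(
    question: str,
    generate: Callable[[str, float], str],  # hypothetical LLM call: (prompt, temperature) -> text
    num_paths: int = 5,
    temperature: float = 0.7,
) -> str:
    # Stage 1: sample a diverse set of reasoning paths from the model.
    sample_prompt = f"Q: {question}\nLet's think step by step."
    paths: List[str] = [generate(sample_prompt, temperature) for _ in range(num_paths)]

    # Stage 2: ask the same model one more time to pick the most agreed answer.
    numbered = "\n\n".join(f"Path {i + 1}:\n{p}" for i, p in enumerate(paths))
    final_prompt = (
        f"Question: {question}\n\n{numbered}\n\n"
        "Which final answer do most of these reasoning paths agree on? "
        "Reply with that answer only."
    )
    return generate(final_prompt, 0.0)
```

Because the selection step is itself a prompt, no answer-format-specific parsing or voting rule is needed, which is the contrast with rule-based post-processing that the abstract draws.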
Channel Estimation with Dynamic Metasurface Antennas via Model-Based Learning
- Authors: Authors: Xiangyu Zhang, Haiyang Zhang, Luxi Yang, Yonina C. Eldar
- Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
- Arxiv link: https://arxiv.org/abs/2311.08158
- Pdf link: https://arxiv.org/pdf/2311.08158
- Abstract Dynamic Metasurface Antenna (DMA) is a cutting-edge antenna technology offering scalable and sustainable solutions for large antenna arrays. The effectiveness of DMAs stems from their inherent configurable analog signal processing capabilities, which facilitate cost-limited implementations. However, when DMAs are used in multiple input multiple output (MIMO) communication systems, they pose challenges in channel estimation due to their analog compression. In this paper, we propose two model-based learning methods to overcome this challenge. Our approach starts by casting channel estimation as a compressed sensing problem. Here, the sensing matrix is formed using a random DMA weighting matrix combined with a spatial gridding dictionary. We then employ the learned iterative shrinkage and thresholding algorithm (LISTA) to recover the sparse channel parameters. LISTA unfolds the iterative shrinkage and thresholding algorithm into a neural network and trains the neural network into a highly efficient channel estimator fitting with the previous channel. As the sensing matrix is crucial to the accuracy of LISTA recovery, we introduce another data-aided method, LISTA-sensing matrix optimization (LISTA-SMO), to jointly optimize the sensing matrix. LISTA-SMO takes LISTA as a backbone and embeds the sensing matrix optimization layers in LISTA's neural network, allowing for the optimization of the sensing matrix along with the training of LISTA. Furthermore, we propose a self-supervised learning technique to tackle the difficulty of acquiring noise-free data. Our numerical results demonstrate that LISTA outperforms traditional sparse recovery methods in channel estimation accuracy and efficiency. In addition, LISTA-SMO achieves better channel accuracy than LISTA, demonstrating the effectiveness of optimizing the sensing matrix.
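For readers unfamiliar with the iteration that LISTA unfolds, here is a minimal sketch of plain (unlearned) ISTA for the problem min over x of 0.5 * ||y - A x||^2 + lam * ||x||_1. It is real-valued and uses a fixed step size 1/L; the learned per-layer weights of LISTA, the DMA sensing matrix, and the paper's complex-valued channel model are not represented, and the function names are chosen here for illustration.

```python
import numpy as np


def soft_threshold(z, tau):
    """Shrink each entry of z toward zero by tau (proximal map of tau * ||.||_1)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def ista(A, y, lam=0.05, iters=500):
    """Iterative shrinkage-thresholding for sparse recovery of x from y = A x + noise."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the smooth term
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # Gradient step on the least-squares term, then shrinkage.
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x
```

Each loop iteration has the shape of one network layer (a linear map followed by a shrinkage nonlinearity). Unfolding a fixed number of iterations and training the linear maps and thresholds is the general idea behind LISTA.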
Neural Lattice Reduction: A Self-Supervised Geometric Deep Learning Approach
- Authors: Authors: Giovanni Luca Marchetti, Gabriele Cesa, Kumar Pratik, Arash Behboodi
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM)
- Arxiv link: https://arxiv.org/abs/2311.08170
- Pdf link: https://arxiv.org/pdf/2311.08170
- Abstract Lattice reduction is a combinatorial optimization problem aimed at finding the most orthogonal basis in a given lattice. In this work, we address lattice reduction via deep learning methods. We design a deep neural model outputting factorized unimodular matrices and train it in a self-supervised manner by penalizing non-orthogonal lattice bases. We incorporate the symmetries of lattice reduction into the model by making it invariant and equivariant with respect to appropriate continuous and discrete groups.
Federated Skewed Label Learning with Logits Fusion
- Authors: Authors: Yuwei Wang, Runhan Li, Hao Tan, Xuefeng Jiang, Sheng Sun, Min Liu, Bo Gao, Zhiyuan Wu
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.08202
- Pdf link: https://arxiv.org/pdf/2311.08202
- Abstract Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data. Data heterogeneity is a critical challenge in realistic FL settings, as it causes significant performance deterioration due to discrepancies in optimization among local models. In this work, we focus on label distribution skew, a common scenario in data heterogeneity, where the data label categories are imbalanced on each client. To address this issue, we propose FedBalance, which corrects the optimization bias among local models by calibrating their logits. Specifically, we introduce an extra private weak learner on the client side, which forms an ensemble model with the local model. By fusing the logits of the two models, the private weak learner can capture the variance of different data, regardless of their category. Therefore, the optimization direction of local models can be improved by increasing the penalty for misclassifying minority classes and reducing the attention to majority classes, resulting in a better global model. Extensive experiments show that our method can gain 13% higher average accuracy compared with state-of-the-art methods.
A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily
- Authors: Authors: Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang
- Subjects: Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08268
- Pdf link: https://arxiv.org/pdf/2311.08268
- Abstract Large Language Models (LLMs), such as ChatGPT and GPT-4, are designed to provide useful and safe responses. However, adversarial prompts known as 'jailbreaks' can circumvent safeguards, leading LLMs to generate harmful content. Exploring jailbreak prompts can help to better reveal the weaknesses of LLMs and further steer us to secure them. Unfortunately, existing jailbreak methods either suffer from intricate manual design or require optimization on another white-box model, compromising generalization or jailbreak efficiency. In this paper, we generalize jailbreak prompt attacks into two aspects: (1) Prompt Rewriting and (2) Scenario Nesting. Based on this, we propose ReNeLLM, an automatic framework that leverages LLMs themselves to generate effective jailbreak prompts. Extensive experiments demonstrate that ReNeLLM significantly improves the attack success rate while greatly reducing the time cost compared to existing baselines. Our study also reveals the inadequacy of current defense methods in safeguarding LLMs. Finally, we offer detailed analysis and discussion from the perspective of prompt execution priority on the failure of LLMs' defense. We hope that our research can catalyze both the academic community and LLM vendors towards the provision of safer and more regulated Large Language Models.
Optimally Managing the Impacts of Convergence Tolerance for Distributed Optimal Power Flow
- Authors: Authors: Rachel Harris, Mohannad Alkhraijah, Daniel K. Molzahn
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2311.08305
- Pdf link: https://arxiv.org/pdf/2311.08305
- Abstract The future power grid may rely on distributed optimization to determine the set-points for huge numbers of distributed energy resources. There has been significant work on applying distributed algorithms to optimal power flow (OPF) problems, which require separate computing agents to agree on shared boundary variable values. Looser tolerances for the mismatches in these shared variables generally yield faster convergence at the expense of exacerbating constraint violations, but there is little quantitative understanding of how the convergence tolerance affects solution quality. To address this gap, we first quantify how convergence tolerance impacts constraint violations when the distributed OPF generator dispatch is applied to the power system. Using insights from this analysis, we then develop a bound tightening algorithm which guarantees that operating points from distributed OPF algorithms will not result in violations despite the possibility of shared variable mismatches within the convergence tolerance. We also explore how bounding the cumulative shared variable mismatches can prevent unnecessary conservativeness in the bound tightening. The proposed approach enables control of the trade-off between computational speed, which improves as the convergence tolerance increases, and distributed OPF solution cost, which increases with convergence tolerance due to tightened constraints, while ensuring feasibility.
GT4Py: High Performance Stencils for Weather and Climate Applications using Python
- Authors: Authors: Enrique G. Paredes, Linus Groner, Stefano Ubbiali, Hannes Vogt, Alberto Madonna, Kean Mariotti, Felipe Cruz, Lucas Benedicic, Mauro Bianco, Joost VandeVondele, Thomas C. Schulthess
- Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Programming Languages (cs.PL)
- Arxiv link: https://arxiv.org/abs/2311.08322
- Pdf link: https://arxiv.org/pdf/2311.08322
- Abstract All major weather and climate applications are currently developed using languages such as Fortran or C++. This is typical in the domain of high performance computing (HPC), where efficient execution is an important concern. Unfortunately, this approach leads to implementations that intermix optimizations for specific hardware architectures with the high-level numerical methods that are typical for the domain. This leads to code that is verbose, difficult to extend and maintain, and difficult to port to different hardware architectures. Here, we propose a different strategy based on GT4Py (GridTools for Python). GT4Py is a Python framework to write weather and climate applications that includes a high-level embedded domain specific language (DSL) to write stencil computations. The toolchain integrated in GT4Py enables automatic code generation to obtain the performance of state-of-the-art C++ and CUDA implementations. The separation of concerns between the mathematical definitions and the actual implementations allows for performance portability of the computations on a wide range of computing architectures, while being embedded in Python allows easy access to the tools of the Python ecosystem to enhance the productivity of the scientists and facilitate integration in complex workflows. Here, the initial release of GT4Py is described, providing an overview of the current state of the framework and performance results showing how GT4Py can outperform pure Python implementations by orders of magnitude.
Calibration of an Elastic Humanoid Upper Body and Efficient Compensation for Motion Planning
- Authors: Authors: Johannes Tenhumberg, Berthold Bäuml
- Subjects: Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2311.08333
- Pdf link: https://arxiv.org/pdf/2311.08333
- Abstract High absolute accuracy is an essential prerequisite for a humanoid robot to autonomously and robustly perform manipulation tasks while avoiding obstacles. We present for the first time a kinematic model for a humanoid upper body incorporating joint and transversal elasticities. These elasticities lead to significant deformations due to the robot's own weight, and the resulting model is implicitly defined via a torque equilibrium. We successfully calibrate this model for DLR's humanoid Agile Justin, including all Denavit-Hartenberg parameters and elasticities. The calibration is formulated as a combined least-squares problem with priors and based on measurements of the end effector positions of both arms via an external tracking system. The absolute position error is massively reduced from 21mm to 3.1mm on average in the whole workspace. Using this complex and implicit kinematic model in motion planning is challenging. We show that for optimization-based path planning, integrating the iterative solution of the implicit model into the optimization loop leads to an elegant and highly efficient solution. For mildly elastic robots like Agile Justin, there is no performance impact, and even for a simulated highly flexible robot with 20 times higher elasticities, the runtime increases by only 30%.
Sparse Linear Regression with Constraints: A Flexible Entropy-based Framework
- Authors: Authors: Amber Srivastava, Alisina Bayati, Srinivasa Salapaka
- Subjects: Systems and Control (eess.SY)
- Arxiv link: https://arxiv.org/abs/2311.08342
- Pdf link: https://arxiv.org/pdf/2311.08342
- Abstract This work presents a new approach to solve the sparse linear regression problem, i.e., to determine a k-sparse vector w in R^d that minimizes the cost ||y - Aw||^2_2. In contrast to the existing methods, our proposed approach splits this k-sparse vector into two parts -- (a) a column stochastic binary matrix V, and (b) a vector x in R^k. Here, the binary matrix V encodes the location of the k non-zero entries in w. Equivalently, it encodes the subset of k columns in the matrix A that map w to y. We demonstrate that this enables modeling several non-trivial application-specific structural constraints on w as constraints on V. The vector x comprises the actual non-zero values in w. We use the Maximum Entropy Principle (MEP) to solve the resulting optimization problem. In particular, we ascribe a probability distribution to the set of all feasible binary matrices V, and iteratively determine this distribution and the vector x such that the associated Shannon entropy gets minimized, and the regression cost attains a pre-specified value. The resulting algorithm employs homotopy from the convex entropy function to the non-convex cost function to avoid poor local minima. We demonstrate the efficacy and flexibility of our proposed approach in incorporating a variety of practical constraints that are otherwise difficult to model using the existing benchmark methods.
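The split of w into a binary selection matrix V and a value vector x can be made concrete with a few lines of NumPy. The sketch below only illustrates the parameterization w = V x for a support that is already known; it does not implement the Maximum Entropy Principle procedure from the paper, and the helper names are hypothetical.

```python
import numpy as np


def selection_matrix(support, d):
    """Build a d x k binary matrix whose j-th column is one-hot at support[j]."""
    k = len(support)
    V = np.zeros((d, k))
    V[list(support), np.arange(k)] = 1.0  # each column sums to 1 (column stochastic)
    return V


def fit_on_support(A, y, support):
    """Given a fixed support, solve for the non-zero values x and rebuild w = V x."""
    V = selection_matrix(support, A.shape[1])
    # A @ V keeps exactly the k columns of A named by the support.
    x, *_ = np.linalg.lstsq(A @ V, y, rcond=None)
    return V, x, V @ x
```

Since `A @ V` equals `A[:, support]`, the cost ||y - Aw||^2 becomes an ordinary least-squares problem in x once V is fixed. The hard part, choosing V, is what the abstract's entropy-based procedure addresses.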
Speeding Up Optimization-based Motion Planning through Deep Learning
- Authors: Authors: Johannes Tenhumberg, Darius Burschka, Berthold Bäuml
- Subjects: Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2311.08345
- Pdf link: https://arxiv.org/pdf/2311.08345
- Abstract Planning collision-free motions for robots with many degrees of freedom is challenging in environments with complex obstacle geometries. Recent work introduced the idea of speeding up the planning by encoding prior experience of successful motion plans in a neural network. However, this "neural motion planning" did not scale to complex robots in unseen 3D environments as needed for real-world applications. Here, we introduce the "basis point set", well-known in computer vision, to neural motion planning as a modern compact environment encoding that enables efficient supervised training of networks that generalize well over diverse 3D worlds. Combined with a new elaborate training scheme, we reach a planning success rate of 100%. We use the network to predict an educated initial guess for an optimization-based planner (OMP), which quickly converges to a feasible solution, massively outperforming random multi-starts when tested on previously unseen environments. For the DLR humanoid Agile Justin with 19DoF and in challenging obstacle environments, optimal paths can be generated in 200ms using only a single CPU core. We also show a first successful real-world experiment based on a high-resolution world model from an integrated 3D sensor.
Hierarchical Experience-informed Navigation for Multi-modal Quadrupedal Rebar Grid Traversal
- Authors: Authors: Max Asselmeier, Jane Ivanova, Ziyi Zhou, Patricio A. Vela, Ye Zhao
- Subjects: Robotics (cs.RO)
- Arxiv link: https://arxiv.org/abs/2311.08354
- Pdf link: https://arxiv.org/pdf/2311.08354
- Abstract This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.
Plum: Prompt Learning using Metaheuristic
- Authors: Authors: Rui Pan, Shuo Xing, Shizhe Diao, Xiang Liu, Kashun Shum, Jipeng Zhang, Tong Zhang
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM)
- Arxiv link: https://arxiv.org/abs/2311.08364
- Pdf link: https://arxiv.org/pdf/2311.08364
- Abstract Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, demonstrating their effectiveness in black-box prompt learning and Chain-of-Thought prompt tuning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown, opening the door to a cornucopia of possibilities in prompt optimization. We release all code at https://github.com/research4pan/Plum.
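As an illustration of the simplest metaheuristic named in this abstract, the sketch below runs hill climbing over a prompt represented as a list of tokens. The edit operations (swap, delete, insert), the `vocab` list, and the black-box `score` callable are assumptions made here; they are not taken from the Plum repository.

```python
import random
from typing import Callable, List


def hill_climb(
    prompt: List[str],
    vocab: List[str],
    score: Callable[[List[str]], float],  # black-box quality measure, e.g. dev-set accuracy
    steps: int = 200,
    seed: int = 0,
) -> List[str]:
    rng = random.Random(seed)
    best, best_score = list(prompt), score(prompt)
    for _ in range(steps):
        cand = list(best)
        op = rng.choice(["swap", "delete", "insert"])
        if op == "swap" and cand:
            cand[rng.randrange(len(cand))] = rng.choice(vocab)
        elif op == "delete" and len(cand) > 1:
            del cand[rng.randrange(len(cand))]
        else:
            cand.insert(rng.randrange(len(cand) + 1), rng.choice(vocab))
        s = score(cand)
        # Greedy acceptance: keep the edit only if it strictly improves the score.
        if s > best_score:
            best, best_score = cand, s
    return best
```

The search only ever calls `score`, so it needs no gradients or model internals. Simulated annealing would differ mainly in sometimes accepting a worse candidate.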
Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding
- Authors: Authors: Guangyu Yang, Jinghong Chen, Weizhe Lin, Bill Byrne
- Subjects: Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08380
- Pdf link: https://arxiv.org/pdf/2311.08380
- Abstract Minimum Bayes Risk (MBR) decoding can significantly improve the translation performance of Multilingual Large Language Models (MLLMs). However, MBR decoding is computationally expensive. In this paper, we show how a recently developed Reinforcement Learning (RL) technique, Direct Preference Optimization (DPO), can be used to fine-tune MLLMs so that we get the gains from MBR without the additional computation at inference. Our fine-tuned models have significantly improved performance on multiple NMT test sets compared to base MLLMs without preference optimization. Our method boosts the translation performance of MLLMs using relatively small monolingual fine-tuning sets.
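The cost that this abstract attributes to MBR decoding comes from scoring every candidate translation against every other candidate. A minimal sketch follows; the token-overlap F1 utility is a stand-in chosen here for simplicity, not the metric used in the paper, and the function names are hypothetical.

```python
from collections import Counter
from typing import Callable, List


def overlap_f1(hyp: str, ref: str) -> float:
    """Toy utility: F1 over whitespace tokens (illustrative stand-in for a real MT metric)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_scores(cands: List[str], utility: Callable[[str, str], float] = overlap_f1) -> List[float]:
    """Expected utility of each candidate, using the other candidates as pseudo-references."""
    n = max(len(cands) - 1, 1)
    return [
        sum(utility(c, other) for j, other in enumerate(cands) if j != i) / n
        for i, c in enumerate(cands)
    ]


def mbr_decode(cands: List[str], utility: Callable[[str, str], float] = overlap_f1) -> str:
    scores = mbr_scores(cands, utility)
    return cands[max(range(len(cands)), key=scores.__getitem__)]
```

With N sampled candidates this makes on the order of N^2 utility calls per source sentence, which is the inference-time overhead the fine-tuning approach aims to avoid.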
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
- Authors: Authors: Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2311.08384
- Pdf link: https://arxiv.org/pdf/2311.08384
- Abstract Hybrid RL is the setting where an RL agent has access to both offline data and online data by interacting with the real-world environment. In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data. On-policy methods such as policy gradient and natural policy gradient (NPG) have been shown to be more robust to model misspecification, though they may sometimes not be as sample efficient as methods that rely on off-policy learning. On the other hand, offline methods that depend on off-policy training often require strong assumptions in theory and are less stable to train in practice. Our new approach integrates a procedure of off-policy training on the offline data into an on-policy NPG framework. We show that our approach, in theory, can obtain a best-of-both-worlds type of result -- it achieves the state-of-the-art theoretical guarantees of offline RL when offline RL-specific assumptions hold, while at the same time maintaining the theoretical guarantees of on-policy NPG regardless of the offline RL assumptions' validity. Experimentally, in challenging rich-observation environments, we show that our approach outperforms a state-of-the-art hybrid RL baseline which only relies on off-policy policy optimization, demonstrating the empirical benefit of combining on-policy and off-policy learning. Our code is publicly available at https://github.com/YifeiZhou02/HNPG.
Fine-tuning Language Models for Factuality
- Authors: Authors: Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, Chelsea Finn
- Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.08401
- Pdf link: https://arxiv.org/pdf/2311.08401
- Abstract The fluency and creativity of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines. Yet language models are prone to making convincing but factually inaccurate claims, often referred to as 'hallucinations.' These errors can inadvertently spread misinformation or harmfully perpetuate misconceptions. Further, manual fact-checking of model responses is a time-consuming process, making human factuality labels expensive to acquire. In this work, we fine-tune language models to be more factual, without human labeling and targeting more open-ended generation settings than past work. We leverage two key recent innovations in NLP to do so. First, several recent works have proposed methods for judging the factuality of open-ended text by measuring consistency with an external knowledge base or simply a large model's confidence scores. Second, the direct preference optimization algorithm enables straightforward fine-tuning of language models on objectives other than supervised imitation, using a preference ranking over possible model responses. We show that learning from automatically generated factuality preference rankings, generated either through existing retrieval systems or our novel retrieval-free approach, significantly improves the factuality (percent of generated claims that are correct) of Llama-2 on held-out topics compared with RLHF or decoding strategies targeted at factuality. At 7B scale, compared to Llama-2-chat, we observe 58% and 40% reduction in factual error rate when generating biographies and answering medical questions, respectively.
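For reference, the direct preference optimization objective that this abstract relies on is commonly written as below. The notation is not taken from the abstract: $\pi_\theta$ is the model being fine-tuned, $\pi_{\mathrm{ref}}$ a frozen reference model, $(x, y_w, y_l)$ a prompt with its preferred and dispreferred responses, $\sigma$ the logistic function, and $\beta$ a temperature-like hyperparameter.

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```

In the setting described here, the preference pairs would come from automatically generated factuality rankings rather than from human labels.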
Instant3D: Instant Text-to-3D Generation
- Authors: Authors: Ming Li, Pan Zhou, Jia-Wei Liu, Jussi Keppo, Min Lin, Shuicheng Yan, Xiangyu Xu
- Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
- Arxiv link: https://arxiv.org/abs/2311.08403
- Pdf link: https://arxiv.org/pdf/2311.08403
- Abstract Text-to-3D generation, which aims to synthesize vivid 3D objects from text prompts, has attracted much attention from the computer vision community. While several existing works have achieved impressive results for this task, they mainly rely on a time-consuming optimization paradigm. Specifically, these methods optimize a neural field from scratch for each text prompt, taking approximately one hour or more to generate one object. This heavy and repetitive training cost impedes their practical deployment. In this paper, we propose a novel framework for fast text-to-3D generation, dubbed Instant3D. Once trained, Instant3D is able to create a 3D object for an unseen text prompt in less than one second with a single run of a feedforward network. We achieve this remarkable speed by devising a new network that directly constructs a 3D triplane from a text prompt. The core innovation of our Instant3D lies in our exploration of strategies to effectively inject text conditions into the network. Furthermore, we propose a simple yet effective activation function, the scaled-sigmoid, to replace the original sigmoid function, which speeds up the training convergence by more than ten times. Finally, to address the Janus (multi-head) problem in 3D generation, we propose an adaptive Perp-Neg algorithm that can dynamically adjust its concept negation scales according to the severity of the Janus problem during training, effectively reducing the multi-head effect. Extensive experiments on a wide variety of benchmark datasets demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods both qualitatively and quantitatively, while achieving significantly better efficiency. The project page is at https://ming1993li.github.io/Instant3DProj.
Keyword: adam
DiLoCo: Distributed Low-Communication Training of Language Models
- Authors: Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Rachita Chhaparia, Yani Donchev, Adhiguna Kuncoro, Marc'Aurelio Ranzato, Arthur Szlam, Jiajun Shen
- Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08105
- Pdf link: https://arxiv.org/pdf/2311.08105
- Abstract Large language models (LLMs) have become a critical component in many applications of machine learning. However, standard approaches to training LLMs require a large number of tightly interconnected accelerators, with devices exchanging gradients and other intermediate states at each optimization step. While it is difficult to build and maintain a single computing cluster hosting many accelerators, it might be easier to find several computing clusters each hosting a smaller number of devices. In this work, we propose a distributed optimization algorithm, Distributed Low-Communication (DiLoCo), that enables training of language models on islands of devices that are poorly connected. The approach is a variant of federated averaging, where the number of inner steps is large, the inner optimizer is AdamW, and the outer optimizer is Nesterov momentum. On the widely used C4 dataset, we show that DiLoCo on 8 workers performs as well as fully synchronous optimization while communicating 500 times less. DiLoCo exhibits great robustness to the data distribution of each worker. It is also robust to resources becoming unavailable over time and, conversely, can seamlessly leverage resources that become available during training.
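The structure named in the abstract (many local AdamW steps per worker, then one Nesterov-momentum outer step) can be sketched on a toy problem. This is an illustrative sketch, not the authors' implementation: the scalar quadratic loss, the function names `adamw_steps` and `diloco`, and all hyperparameter values are assumptions, as is the common federated-averaging convention of treating the averaged parameter change as a pseudo-gradient. For brevity the sketch also resets the AdamW state every round.

```python
import math

def adamw_steps(w, grad_fn, steps, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    """Run `steps` AdamW updates on a scalar parameter (inner optimizer)."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        # decoupled weight decay is added to the Adam direction
        w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w)
    return w

def diloco(targets, rounds=50, inner_steps=20, outer_lr=0.7, momentum=0.9):
    """Toy outer loop: worker k minimizes 0.5 * (w - targets[k]) ** 2."""
    w, buf = 0.0, 0.0
    for _ in range(rounds):
        # Each worker starts from the shared parameters and trains locally;
        # only the resulting parameter change is communicated.
        deltas = []
        for target in targets:
            local = adamw_steps(w, lambda x, t=target: x - t, inner_steps)
            deltas.append(w - local)
        outer_grad = sum(deltas) / len(deltas)   # averaged pseudo-gradient
        # Nesterov momentum step (assumed outer hyperparameters)
        buf = momentum * buf + outer_grad
        w = w - outer_lr * (outer_grad + momentum * buf)
    return w

if __name__ == "__main__":
    print(diloco([1.0, 2.0, 3.0, 4.0]))
```

In this toy setup workers communicate once per round rather than once per gradient step, which is where the communication saving described in the abstract would come from.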
Constant Query Local Decoding Against Deletions Is Impossible
- Authors: Meghal Gupta
- Subjects: Information Theory (cs.IT); Data Structures and Algorithms (cs.DS)
- Arxiv link: https://arxiv.org/abs/2311.08399
- Pdf link: https://arxiv.org/pdf/2311.08399
- Abstract Locally decodable codes (LDC's) are error-correcting codes that allow recovery of individual message indices by accessing only a constant number of codeword indices. For substitution errors, it is evident that LDC's exist -- Hadamard codes are examples of $2$-query LDC's. Research on this front has focused on finding the optimal encoding length for LDC's, for which there is a nearly exponential gap between the best lower bounds and constructions. Ostrovsky and Paskin-Cherniavsky (ICITS 2015) introduced the notion of local decoding to the insertion and deletion setting. In this context, it is not clear whether constant query LDC's exist at all. Indeed, in contrast to the classical setting, Block et al. conjecture that they do not exist. Blocki et al. (FOCS 2021) make progress towards this conjecture, proving that any potential code must have at least exponential encoding length. Our work definitively resolves the conjecture and shows that constant query LDC's do not exist in the insertion/deletion (or even deletion-only) setting. Using a reduction shown by Blocki et al., this also implies that constant query locally correctable codes do not exist in this setting.
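The abstract's remark that Hadamard codes are 2-query LDCs for substitution errors can be made concrete with a short sketch. It covers only the classical substitution setting, not the insertion/deletion setting the paper addresses. The function names are hypothetical; the decoder relies on the linearity of the code: the bits at positions `a` and `a XOR e_i` differ by exactly message bit `i`.

```python
import random

def hadamard_encode(bits):
    """Encode a k-bit message as all 2**k inner products <x, a> mod 2."""
    k = len(bits)
    x = sum(b << j for j, b in enumerate(bits))
    return [bin(x & a).count("1") % 2 for a in range(2 ** k)]

def decode_bit(codeword, i, a):
    """Recover message bit i from two queries: positions a and a XOR e_i."""
    return codeword[a] ^ codeword[a ^ (1 << i)]

def local_decode(codeword, i, rng=random):
    """Randomized 2-query decoder: pick the query position a uniformly."""
    return decode_bit(codeword, i, rng.randrange(len(codeword)))

if __name__ == "__main__":
    message = [1, 0, 1, 1]
    received = hadamard_encode(message)
    received[5] ^= 1  # one substitution error
    # Count how many choices of a still decode bit 0 correctly.
    good = sum(decode_bit(received, 0, a) == message[0] for a in range(16))
    print(good, "of 16 query pairs decode bit 0 correctly")
```

Each corrupted position spoils only the query pairs that touch it, so a small fraction of substitutions leaves most random query pairs correct.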
Keyword: gradient
Finetuning Text-to-Image Diffusion Models for Fairness
- Authors: Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan Kankanhalli
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
- Arxiv link: https://arxiv.org/abs/2311.07604
- Pdf link: https://arxiv.org/pdf/2311.07604
- Abstract The rapid adoption of text-to-image diffusion models in society underscores an urgent need to address their biases. Without interventions, these biases could propagate a distorted worldview and limit opportunities for minority groups. In this work, we frame fairness as a distributional alignment problem. Our solution consists of two main technical contributions: (1) a distributional alignment loss that steers specific characteristics of the generated images towards a user-defined target distribution, and (2) biased direct finetuning of the diffusion model's sampling process, which leverages a biased gradient to more effectively optimize losses defined on the generated images. Empirically, our method markedly reduces gender, racial, and their intersectional biases for occupational prompts. Gender bias is significantly reduced even when finetuning just five soft tokens. Crucially, our method supports diverse perspectives of fairness beyond absolute equality, which is demonstrated by controlling age to a $75\%$ young and $25\%$ old distribution while simultaneously debiasing gender and race. Finally, our method is scalable: it can debias multiple concepts at once by simply including these prompts in the finetuning data. We hope our work facilitates the social alignment of T2I generative AI. We will share code and various debiased diffusion model adaptors.
In-context Learning and Gradient Descent Revisited
- Authors: Tomer Bar Nathan, Gilad Deutch, Nadav Magar, Guy Dar
- Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.07772
- Pdf link: https://arxiv.org/pdf/2311.07772
- Abstract In-context learning (ICL) has shown impressive results in few-shot learning tasks, yet its underlying mechanism is still not fully understood. Recent works suggest that ICL can be thought of as a gradient descent (GD) based optimization process. While promising, these results mainly focus on simplified settings of ICL and provide only a preliminary evaluation of the similarities between the two methods. In this work, we revisit the comparison between ICL and GD-based finetuning and study what properties of ICL an equivalent process must follow. We highlight a major difference in the flow of information between ICL and standard finetuning. Namely, ICL can only rely on information from lower layers at every point, while finetuning depends on loss gradients from deeper layers. We refer to this discrepancy as Layer Causality and show that a layer causal variant of the finetuning process aligns with ICL on par with vanilla finetuning and is even better in most cases across relevant metrics. To the best of our knowledge, this is the first work to discuss this discrepancy explicitly and suggest a solution that tackles this problem with minimal changes.
DiLoCo: Distributed Low-Communication Training of Language Models
- Authors: Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Rachita Chhaparia, Yani Donchev, Adhiguna Kuncoro, Marc'Aurelio Ranzato, Arthur Szlam, Jiajun Shen
- Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08105
- Pdf link: https://arxiv.org/pdf/2311.08105
- Abstract Large language models (LLMs) have become a critical component in many applications of machine learning. However, standard approaches to training LLMs require a large number of tightly interconnected accelerators, with devices exchanging gradients and other intermediate states at each optimization step. While it is difficult to build and maintain a single computing cluster hosting many accelerators, it might be easier to find several computing clusters each hosting a smaller number of devices. In this work, we propose a distributed optimization algorithm, Distributed Low-Communication (DiLoCo), that enables training of language models on islands of devices that are poorly connected. The approach is a variant of federated averaging, where the number of inner steps is large, the inner optimizer is AdamW, and the outer optimizer is Nesterov momentum. On the widely used C4 dataset, we show that DiLoCo on 8 workers performs as well as fully synchronous optimization while communicating 500 times less. DiLoCo exhibits great robustness to the data distribution of each worker. It is also robust to resources becoming unavailable over time and, conversely, can seamlessly leverage resources that become available during training.
Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models
- Authors: Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sung Ju Hwang, Se-young Yun
- Subjects: Computation and Language (cs.CL)
- Arxiv link: https://arxiv.org/abs/2311.08106
- Pdf link: https://arxiv.org/pdf/2311.08106
- Abstract In an ever-evolving world, the dynamic nature of knowledge presents challenges for language models that are trained on static data, leading to outdated encoded information. However, real-world scenarios require models not only to acquire new knowledge but also to overwrite outdated information with updated information. To address this under-explored issue, we introduce the temporally evolving question answering benchmark, EvolvingQA, a novel benchmark designed for training and evaluating LMs on an evolving Wikipedia database, where the construction of our benchmark is automated with our pipeline using large language models. Our benchmark incorporates question-answering as a downstream task to emulate real-world applications. Through EvolvingQA, we uncover that existing continual learning baselines have difficulty in updating and forgetting outdated knowledge. Our findings suggest that the models fail to learn updated knowledge due to the small weight gradient. Furthermore, we elucidate that the models struggle mostly on providing numerical or temporal answers to questions asking for updated knowledge. Our work aims to model the dynamic nature of real-world information, offering a robust measure for the evolution-adaptability of language models.
Evaluating Neighbor Explainability for Graph Neural Networks
- Authors: Oscar Llorente, Péter Vaderna, Sándor Laki, Roland Kotroczó, Rita Csoma, János Márk Szalai-Gindl
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
- Arxiv link: https://arxiv.org/abs/2311.08118
- Pdf link: https://arxiv.org/pdf/2311.08118
- Abstract Explainability in Graph Neural Networks (GNNs) is a new field that has grown over the last few years. In this publication we address the problem of determining how important each neighbor is for the GNN when classifying a node and how to measure the performance for this specific task. To do this, various known explainability methods are reformulated to get the neighbor importance and four new metrics are presented. Our results show that there is almost no difference between the explanations provided by gradient-based techniques in the GNN domain. In addition, many explainability techniques failed to identify important neighbors when GNNs without self-loops are used.
Computational homogenization of higher-order electro-mechanical materials with built-in generalized periodicity conditions
- Authors: J. Barceló-Mercader, D. Codony, A. Mocci, I. Arias
- Subjects: Numerical Analysis (math.NA); Computational Engineering, Finance, and Science (cs.CE)
- Arxiv link: https://arxiv.org/abs/2311.08196
- Pdf link: https://arxiv.org/pdf/2311.08196
- Abstract We present a formulation for high-order generalized periodicity conditions in the context of a high-order electromechanical theory including flexoelectricity, strain gradient elasticity and gradient dielectricity, with the goal of studying periodic architected metamaterials. Such theory results in fourth-order governing partial differential equations, and the periodicity conditions involve continuity across the periodic boundary of primal fields (displacement and electric potential) and their normal derivatives, as well as continuity of the corresponding dual generalized forces (tractions, double tractions, surface charge density and double surface charge density). Rather than imposing these conditions numerically as explicit constraints, we develop an approximation space which fulfils generalized periodicity by construction. Our method naturally allows us to impose general macroscopic fields (strains/stresses and electric fields/electric displacements) along arbitrary directions, enabling the characterization of the material anisotropy. We apply the proposed method to study periodic architected metamaterials with apparent piezoelectricity. We first verify the method by directly comparing the results with a large periodic structure, then apply it to evaluate the anisotropic apparent piezoelectricity of a geometrically polarized 2D lattice, and finally demonstrate the application of the method in a 3D architected metamaterial.
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
- Authors: Nicholas E. Corrado, Josiah P. Hanna
- Subjects: Machine Learning (cs.LG)
- Arxiv link: https://arxiv.org/abs/2311.08290
- Pdf link: https://arxiv.org/pdf/2311.08290
- Abstract On-policy reinforcement learning (RL) algorithms perform policy updates using i.i.d. trajectories collected by the current policy. However, after observing only a finite number of trajectories, on-policy sampling may produce data that fails to match the expected on-policy data distribution. This sampling error leads to noisy updates and data inefficient on-policy learning. Recent work in the policy evaluation setting has shown that non-i.i.d., off-policy sampling can produce data with lower sampling error than on-policy sampling can produce. Motivated by this observation, we introduce an adaptive, off-policy sampling method to improve the data efficiency of on-policy policy gradient algorithms. Our method, Proximal Robust On-Policy Sampling (PROPS), reduces sampling error by collecting data with a behavior policy that increases the probability of sampling actions that are under-sampled with respect to the current policy. Rather than discarding data from old policies -- as is commonly done in on-policy algorithms -- PROPS uses data collection to adjust the distribution of previously collected data to be approximately on-policy. We empirically evaluate PROPS on both continuous-action MuJoCo benchmark tasks as well as discrete-action tasks and demonstrate that (1) PROPS decreases sampling error throughout training and (2) improves the data efficiency of on-policy policy gradient algorithms. Our work improves the RL community's understanding of a nuance in the on-policy vs off-policy dichotomy: on-policy learning requires on-policy data, not on-policy sampling.
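The notion of sampling error in the abstract, and the idea of favoring under-sampled actions, can be illustrated with a toy single-state example. This is not PROPS itself: the deterministic "pick the most under-sampled action" rule, the function names, and the three-action policy are all assumptions made for illustration.

```python
import random

def greedy_undersampled_actions(policy, n):
    """Non-i.i.d. sampling: always take the action with the largest deficit
    between its expected count under `policy` and its observed count."""
    counts = [0] * len(policy)
    for step in range(1, n + 1):
        deficits = [p * step - c for p, c in zip(policy, counts)]
        action = max(range(len(policy)), key=lambda i: deficits[i])
        counts[action] += 1
    return counts

def iid_actions(policy, n, rng):
    """Ordinary on-policy sampling: n independent draws from `policy`."""
    counts = [0] * len(policy)
    for action in rng.choices(range(len(policy)), weights=policy, k=n):
        counts[action] += 1
    return counts

def sampling_error(policy, counts):
    """L1 distance between the empirical action frequencies and the policy."""
    n = sum(counts)
    return sum(abs(c / n - p) for p, c in zip(policy, counts))

if __name__ == "__main__":
    policy = [0.5, 0.3, 0.2]
    rng = random.Random(0)
    print("i.i.d. error:", sampling_error(policy, iid_actions(policy, 100, rng)))
    print("greedy error:", sampling_error(policy, greedy_undersampled_actions(policy, 100)))
```

The greedy sampler never draws from the policy, yet its empirical action frequencies track the policy closely, which mirrors the abstract's closing point that on-policy learning needs on-policy data rather than on-policy sampling.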
Inverse Learning with Extremely Sparse Feedback for Recommendation
- Authors: Guanyu Lin, Chen Gao, Yu Zheng, Yinfeng Li, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li
- Subjects: Information Retrieval (cs.IR)
- Arxiv link: https://arxiv.org/abs/2311.08302
- Pdf link: https://arxiv.org/pdf/2311.08302
- Abstract Modern personalized recommendation services often rely on user feedback, either explicit or implicit, to improve the quality of services. Explicit feedback refers to behaviors like ratings, while implicit feedback refers to behaviors like user clicks. However, in the scenario of full-screen video viewing experiences like TikTok and Reels, the click action is absent, resulting in unclear feedback from users and hence introducing noise into model training. Existing approaches to denoising recommendation mainly focus on positive instances while ignoring the noise in a large amount of sampled negative feedback. In this paper, we propose a meta-learning method to annotate the unlabeled data from loss and gradient perspectives, which considers the noise in both positive and negative instances. Specifically, we first propose an Inverse Dual Loss (IDL) to boost true-label learning and prevent false-label learning. Then we further propose an Inverse Gradient (IG) method to explore the correct updating gradient and adjust the updating based on meta-learning. Finally, we conduct extensive experiments on both benchmark and industrial datasets where our proposed method can significantly improve AUC by 9.25% against state-of-the-art methods. Further analysis verifies the proposed inverse learning framework is model-agnostic and can improve a variety of recommendation backbones. The source code, along with the best hyper-parameter settings, is available at this link: https://github.com/Guanyu-Lin/InverseLearning.
Sparsity-Preserving Differentially Private Training of Large Embedding Models
- Authors: Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang
- Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
- Arxiv link: https://arxiv.org/abs/2311.08357
- Pdf link: https://arxiv.org/pdf/2311.08357
- Abstract As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without compromising model accuracy by much. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models. Our algorithms achieve substantial reductions ($10^6 \times$) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
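The sparsity problem the abstract describes can be seen in a small sketch of a standard DP-SGD gradient step (per-example clipping followed by Gaussian noise on every coordinate). It illustrates the problem only, not DP-FEST or DP-AdaFEST. The toy per-example gradient, the function names, and the values of `max_norm` and `noise_multiplier` are assumptions.

```python
import math
import random

def per_example_grad(token_ids, dim):
    """Hypothetical embedding gradient: only rows of looked-up tokens are nonzero."""
    return {t: [1.0] * dim for t in token_ids}

def clip(grad, max_norm):
    """Scale a sparse per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for row in grad.values() for x in row))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return {r: [x * scale for x in row] for r, row in grad.items()}

def dp_sgd_gradient(batch, vocab_size, dim, max_norm, noise_multiplier, rng):
    """Return (clipped sum, noisy sum) of per-example embedding gradients."""
    total = [[0.0] * dim for _ in range(vocab_size)]
    for example in batch:
        for r, row in clip(per_example_grad(example, dim), max_norm).items():
            for j, x in enumerate(row):
                total[r][j] += x
    sigma = noise_multiplier * max_norm
    # Naive DP-SGD adds noise to every coordinate, including untouched rows.
    noisy = [[x + rng.gauss(0.0, sigma) for x in row] for row in total]
    return total, noisy

def nonzero_rows(table):
    return sum(1 for row in table if any(x != 0.0 for x in row))

if __name__ == "__main__":
    rng = random.Random(0)
    batch = [[1, 5], [5, 9]]  # token ids looked up by two examples
    clean, noisy = dp_sgd_gradient(batch, 1000, 4, 1.0, 1.0, rng)
    print(nonzero_rows(clean), "nonzero rows before noise")
    print(nonzero_rows(noisy), "nonzero rows after noise")
```

Before noise, only the rows for the looked-up tokens carry gradient; after noise, every row of the embedding table does, so the update can no longer be stored or applied sparsely.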
Plum: Prompt Learning using Metaheuristic
- Authors: Rui Pan, Shuo Xing, Shizhe Diao, Xiang Liu, Kashun Shum, Jipeng Zhang, Tong Zhang
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM)
- Arxiv link: https://arxiv.org/abs/2311.08364
- Pdf link: https://arxiv.org/pdf/2311.08364
- Abstract Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, demonstrating their effectiveness in black-box prompt learning and Chain-of-Thought prompt tuning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown, opening the door to a cornucopia of possibilities in prompt optimization. We release all the code at https://github.com/research4pan/Plum.
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
- Authors: Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun
- Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
- Arxiv link: https://arxiv.org/abs/2311.08384
- Pdf link: https://arxiv.org/pdf/2311.08384
- Abstract Hybrid RL is the setting where an RL agent has access to both offline data and online data by interacting with the real-world environment. In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data. On-policy methods such as policy gradient and natural policy gradient (NPG) have been shown to be more robust to model misspecification, though they may not always be as sample efficient as methods that rely on off-policy learning. On the other hand, offline methods that depend on off-policy training often require strong assumptions in theory and are less stable to train in practice. Our new approach integrates a procedure of off-policy training on the offline data into an on-policy NPG framework. We show that our approach, in theory, can obtain a best-of-both-worlds type of result: it achieves the state-of-the-art theoretical guarantees of offline RL when offline RL-specific assumptions hold, while at the same time maintaining the theoretical guarantees of on-policy NPG regardless of the offline RL assumptions' validity. Experimentally, in challenging rich-observation environments, we show that our approach outperforms a state-of-the-art hybrid RL baseline which only relies on off-policy policy optimization, demonstrating the empirical benefit of combining on-policy and off-policy learning. Our code is publicly available at https://github.com/YifeiZhou02/HNPG.
Keyword: super-resolution
Learning based Deep Disentangling Light Field Reconstruction and Disparity Estimation Application
- Authors: Langqing Shi, Ping Zhou
- Subjects: Computer Vision and Pattern Recognition (cs.CV)
- Arxiv link: https://arxiv.org/abs/2311.08129
- Pdf link: https://arxiv.org/pdf/2311.08129
- Abstract Light field cameras have a wide range of uses due to their ability to simultaneously record light intensity and direction. The angular resolution of light fields is important for downstream tasks such as depth estimation, yet is often difficult to improve due to hardware limitations. Conventional methods tend to perform poorly against the challenge of large disparity in sparse light fields, while general CNNs have difficulty extracting spatial and angular features coupled together in 4D light fields. The light field disentangling mechanism transforms the 4D light field into 2D image format, which is more favorable for CNN feature extraction. In this paper, we propose a Deep Disentangling Mechanism, which inherits the principle of the light field disentangling mechanism, further develops the design of the feature extractor, and adds an advanced network structure. We design a light-field reconstruction network (i.e., DDASR) on the basis of the Deep Disentangling Mechanism, and achieve SOTA performance in the experiments. In addition, we design a Block Traversal Angular Super-Resolution Strategy for the practical application of depth estimation enhancement, where the number of input views is often higher than 2x2 in the experiments, resulting in high memory usage; the strategy reduces memory usage while achieving better reconstruction performance.