NLPModelsIpopt.jl

`SystemError: Too many open files` when running `ipopt` in a loop

Open ForceBru opened this issue 1 year ago • 1 comments

TL;DR: I get the following error when I run `ipopt` multiple times in a loop:

ERROR: SystemError: opening file "/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_27Ed4DC3G3": Too many open files

According to the stack trace, the error is raised at this line:

https://github.com/JuliaSmoothOptimizers/NLPModelsIpopt.jl/blob/1a16af5178bff0458e3a00b61f93675d968a2a86/src/NLPModelsIpopt.jl#L242

I suspect that `ipopt_log_file` is never closed: I can't find any place in the code where it is closed or deleted.
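To make the suspicion concrete, here is a minimal sketch of the general pattern, not the NLPModelsIpopt source. The helper names `write_log_leaky` and `write_log_closed` are hypothetical and exist only for this illustration. It contrasts a temporary file whose handle is left open with one that is closed and deleted as soon as it has been read.

```julia
# Illustration only: hypothetical helpers, not code from NLPModelsIpopt.

# Leaky pattern: each call leaves one file descriptor open until the
# stream is closed explicitly or finalized by the garbage collector.
function write_log_leaky(msg)
    path, io = mktemp()          # creates and opens a temporary file
    println(io, msg)
    flush(io)
    return path, io              # `io` is still open when we return
end

# Safer pattern: the do-block form of `mktemp` closes the stream and
# removes the file when the block returns, even if it throws.
function write_log_closed(msg)
    mktemp() do path, io
        println(io, msg)
        flush(io)                # make sure the text is on disk
        readlines(path)          # read the log back while the file exists
    end
end

path, io = write_log_leaky("leaky")
@show isopen(io)                 # the handle is still held here
close(io); rm(path)              # manual cleanup for the leaky case

lines = write_log_closed("closed")
@show lines
```

If a loop follows the first pattern, the number of open descriptors grows with every iteration until the operating system's per-process limit is reached.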

Code

### A Pluto.jl notebook ###
# v0.19.22

using Markdown
using InteractiveUtils

# ╔═╡ 19c05094-a7d7-11ed-3f09-bdd2c5211893
import Pkg; Pkg.activate(temp=true); Pkg.add([
	Pkg.PackageSpec(name="ADNLPModels", version="0.5.1"),
	Pkg.PackageSpec(name="NLPModelsIpopt", version="0.10.0"),
], io=devnull); Pkg.status()

# ╔═╡ 542a1e70-1143-4491-9d96-74b90974f5a4
import Random; using Statistics, SparseArrays

# ╔═╡ f040621c-d861-489a-9823-d0ad3f129094
import ADNLPModels, NLPModelsIpopt

# ╔═╡ 25c475f6-1cf6-47ef-ac95-f00390b503e1
npdf(x, m, v) = exp(-(x - m)^2 / (2v)) / sqrt(2π * v);

# ╔═╡ 82bd7b5c-42c8-487f-8488-05d6113d10c7
function objective(params, bb, data)
	a, b, c = params[1:2], params[3:4], params[5:6]

	-2mean(
		sum(
			p * npdf(x, m, v + bb)
			for (p, m, v) in zip(a, b, c)
		)
		for x in data
	) + sum(
		a[j] * a[k] * npdf(b[j], b[k], c[j] + c[k])
		for j in eachindex(a), k in eachindex(a)
	)
end;

# ╔═╡ 8ad0f488-5043-4724-8c55-a5b866cc3652
function fit(fn::Function, x0, bb, data)
	model = ADNLPModels.ADNLPModel(
		par -> fn(par, bb, data),
		x0, [0, 0, -Inf, -Inf, 0, 0], [1, 1, Inf, Inf, Inf, Inf],
		sparse([1. 1 0 0 0 0]) |> dropzeros, [1.], [1.]
	)
	NLPModelsIpopt.ipopt(model, print_level=0).solution
end;

# ╔═╡ ec57a908-ed39-40a7-8856-624fd72c8f4c
random_data(rng) = [randn(rng, 130) .+ 1; 0.5randn(rng, 200) .- 1];

# ╔═╡ 5c4cbfb2-195c-462d-be5a-8cf8d31777fd
data = random_data(Random.MersenneTwister(10));

# ╔═╡ 591afe94-ff80-4a11-bd30-2ac6f8d118e7
ans = fit(objective, [0.45, 0.55, 0, 0, 1, 2], 0.1, data)

# ╔═╡ 95e430ff-03ba-4443-80b0-2f5f0ab4202d
w = let nrep = 200
	x0 = [0.45, 0.55, 0, 0, 1, 2]
	bb = 0.1
	w = zeros(nrep)
	for i in 1:nrep
		@info i
		data = random_data(Random.MersenneTwister(i))
		par = fit(objective, x0, bb, data)

		w[i] = mean(data)
	end
	w
end

# ╔═╡ Cell order:
# ╠═19c05094-a7d7-11ed-3f09-bdd2c5211893
# ╠═542a1e70-1143-4491-9d96-74b90974f5a4
# ╠═f040621c-d861-489a-9823-d0ad3f129094
# ╠═25c475f6-1cf6-47ef-ac95-f00390b503e1
# ╠═82bd7b5c-42c8-487f-8488-05d6113d10c7
# ╠═8ad0f488-5043-4724-8c55-a5b866cc3652
# ╠═ec57a908-ed39-40a7-8856-624fd72c8f4c
# ╠═5c4cbfb2-195c-462d-be5a-8cf8d31777fd
# ╠═591afe94-ff80-4a11-bd30-2ac6f8d118e7
# ╠═95e430ff-03ba-4443-80b0-2f5f0ab4202d

This is a Pluto notebook, but it can be run like a regular Julia script. The error below also occurs when the code is run from Pluto.

Error

$ julia-1.8 -i notebook.jl
  Activating new project at `/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_Hhyb1u`
Status `/private/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_Hhyb1u/Project.toml`
  [54578032] ADNLPModels v0.5.1
  [f4238b75] NLPModelsIpopt v0.10.0

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

[ Info: 1
...
[ Info: 200
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.5 (2023-01-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> w = let nrep = 200
               x0 = [0.45, 0.55, 0, 0, 1, 2]
               bb = 0.1
               w = zeros(nrep)
               for i in 1:nrep
                       @info i
                       data = random_data(Random.MersenneTwister(i))
                       par = fit(objective, x0, bb, data)

                       w[i] = mean(data)
               end
               w
       end
[ Info: 1
...
[ Info: 200
200-element Vector{Float64}:
 -0.2779192839748305
...
 -0.17939514739188267

julia> w = let nrep = 200
               x0 = [0.45, 0.55, 0, 0, 1, 2]
               bb = 0.1
               w = zeros(nrep)
               for i in 1:nrep
                       @info i
                       data = random_data(Random.MersenneTwister(i))
                       par = fit(objective, x0, bb, data)

                       w[i] = mean(data)
               end
               w
       end
[ Info: 1
...
[ Info: 200
200-element Vector{Float64}:
 -0.2779192839748305
...
 -0.17939514739188267

julia> w = let nrep = 200
               x0 = [0.45, 0.55, 0, 0, 1, 2]
               bb = 0.1
               w = zeros(nrep)
               for i in 1:nrep
                       @info i
                       data = random_data(Random.MersenneTwister(i))
                       par = fit(objective, x0, bb, data)

                       w[i] = mean(data)
               end
               w
       end
[ Info: 1
[ Info: 2
...
[ Info: 118
[ Info: 119
ERROR: SystemError: opening file "/var/folders/ys/3h0gnqns4b98zb66_vl_m35m0000gn/T/jl_27Ed4DC3G3": Too many open files
Stacktrace:
  [1] systemerror(p::String, errno::Int32; extrainfo::Nothing)
    @ Base ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/lib/julia/sys.dylib:-1
  [2] (::Base.var"#systemerror##kw")(::NamedTuple{(:extrainfo,), Tuple{Nothing}}, ::typeof(systemerror), p::String, errno::Int32)
    @ Base ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/lib/julia/sys.dylib:-1
  [3] (::Base.var"#systemerror##kw")(::NamedTuple{(:extrainfo,), Tuple{Nothing}}, ::typeof(systemerror), p::String)
    @ Base ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/lib/julia/sys.dylib:-1
  [4] open(fname::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing)
    @ Base ~/Desktop/Julia/Julia-1.8.app/Contents/Resources/julia/lib/julia/sys.dylib:-1
  [5] open
    @ ./iostream.jl:275 [inlined]
  [6] open(f::Base.var"#399#400"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./io.jl:382
  [7] open
    @ ./io.jl:381 [inlined]
  [8] #readlines#398
    @ ./io.jl:584 [inlined]
  [9] readlines
    @ ./io.jl:583 [inlined]
 [10] solve!(solver::NLPModelsIpopt.IpoptSolver, nlp::ADNLPModels.ADNLPModel{Float64, Vector{Float64}, Vector{Int64}}, stats::SolverCore.GenericExecutionStats{Float64, Vector{Float64}, Vector{Float64}, Any}; callback::Function, kwargs::Base.Pairs{Symbol, Int64, Tuple{Symbol}, NamedTuple{(:print_level,), Tuple{Int64}}})
    @ NLPModelsIpopt ~/.julia/packages/NLPModelsIpopt/X7HD4/src/NLPModelsIpopt.jl:242
 [11] #ipopt#6
    @ ~/.julia/packages/NLPModelsIpopt/X7HD4/src/NLPModelsIpopt.jl:158 [inlined]
 [12] fit(fn::typeof(objective), x0::Vector{Float64}, bb::Float64, data::Vector{Float64})
    @ Main ~/Desktop/Julia/test/bug/notebook.jl:45
 [13] top-level scope
    @ ./REPL[1]:8

julia>

Steps to reproduce

  1. Run `julia-1.8 -i notebook.jl` to execute the full code.
  2. Run the last expression (`w = let nrep = 200 ...`) manually several times in quick succession.
  3. The first couple of runs succeed.
  4. Subsequent runs fail with the error above.
  5. I waited a couple of minutes and executed the same expression in the same Julia REPL again, and it worked. Presumably, macOS closed some unused open files automatically?
  6. Further runs fail again anyway.

Running lsof shows that the julia process has a lot of open files

Before running the `w = let ...` code (Julia REPL is running):

$ lsof -n +c 0 | sed -E 's/^([^ ]+[ ]+[^ ]+).*$/\1/' | uniq -c | sort | tail
 153 corespotlightd                      563
 159 Safari                             1522
 161 com.apple.WebKit.WebContent        1535
 177 bird                                397
 191 AppleSpell                          884
 206 cloudd                              425
 282 UserEventAgent                      370
 320 Dock                                376
 353 iconservicesagent                   401
 485 firefox                             668

The `julia` process has too few open files to appear in this list. The left column is the number of open files, the middle column is the process name, and the right column is the process ID.

After running `w = let ...` once:

$ lsof -n +c 0 | sed -E 's/^([^ ]+[ ]+[^ ]+).*$/\1/' | uniq -c | sort | tail
 159 Safari                             1522
 161 com.apple.WebKit.WebContent        1535
 177 bird                                397
 191 AppleSpell                          884
 206 cloudd                              425
 280 julia                              1676
 282 UserEventAgent                      370
 320 Dock                                376
 353 iconservicesagent                   401
 485 firefox                             668

Now `julia` has 280 open files! After running the code one more time and getting the `Too many open files` error:

$ lsof -n +c 0 | sed -E 's/^([^ ]+[ ]+[^ ]+).*$/\1/' | uniq -c | sort | tail
 159 Safari                             1522
 161 com.apple.WebKit.WebContent        1535
 177 bird                                397
 191 AppleSpell                          884
 206 cloudd                              425
 282 UserEventAgent                      370
 313 julia                              1676
 320 Dock                                376
 353 iconservicesagent                   401
 485 firefox                             668

Now `julia` has 313 open files, and I can't run even a single iteration of the loop.

Several minutes later, `julia` no longer appears in the `lsof` output, and I can run my code a couple of times until the inevitable `SystemError: Too many open files`.
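The same count can also be taken from inside the Julia session, which makes it easier to see whether each solver call leaves a descriptor behind. The sketch below is only a diagnostic under stated assumptions: `count_open_fds` is a hypothetical helper, it relies on the system exposing `/dev/fd` (as macOS and Linux do), and the number it returns may include the descriptor used to list that directory.

```julia
# Hypothetical diagnostic helper, not part of Julia or NLPModelsIpopt.
# Assumes a Unix-like system where /dev/fd lists this process's descriptors.
count_open_fds() = length(readdir("/dev/fd"))

before = count_open_fds()

path = tempname()
io = open(path, "w")             # deliberately hold one extra descriptor
during = count_open_fds()

close(io)                        # release it again
rm(path)
after = count_open_fds()

@show before during after
```

Calling `count_open_fds()` before and after each `fit(objective, x0, bb, data)` call in the loop would show whether the count grows by a fixed amount per iteration.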

Versions

  • Julia v1.8.5
  • ADNLPModels v0.5.1
  • NLPModelsIpopt v0.10.0
  • macOS 10.15.7

ForceBru, Feb 08 '23 18:02