pushmeup icon indicating copy to clipboard operation
pushmeup copied to clipboard

Persistence not working properly

Open kert-io opened this issue 10 years ago • 18 comments

This is a chunk from my irb test. Can you tell me what I might be doing wrong or is the close method in conflict with persistence?

irb(main):007:0>APNS.start_persistence => true irb(main):008:0> APNS.send_notification(device_token, 'tell me if you get this') NoMethodError: undefined method close' for nil:NilClass from /home/guruninja/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/pushmeup-0.3.0/lib/pushmeup/apns/core.rb:83:inrescue in with_connection' from /home/guruninja/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/pushmeup-0.3.0/lib/pushmeup/apns/core.rb:72:in `with_connection'

kert-io avatar Oct 05 '14 00:10 kert-io

i'm getting the same error but without using APNS.start_persistence

Goltergaul avatar Oct 23 '14 15:10 Goltergaul

I solve this by not testing in irb. Had it running in production now for two months and it works great. why persistence doesn't work in irb I never tested.

On Mon, Dec 15, 2014 at 3:36 PM, Matias [email protected] wrote:

Any clues of how to solve this?

— Reply to this email directly or view it on GitHub https://github.com/NicosKaralis/pushmeup/issues/45#issuecomment-67060855 .

kert-io avatar Dec 21 '14 01:12 kert-io

This does occur for me in production from background sidekiq workers. Subsequent retries are successful.

astjohn avatar Dec 24 '14 15:12 astjohn

First and foremost, what if any error message do you get?

Where are your setting it to persist? What does your over all worker config look like? When is this set?

And then, does a call to your push worker look like?

On Wed, Dec 24, 2014 at 10:22 AM, Adam St. John [email protected] wrote:

This does occur for me in production from background sidekiq workers. Subsequent retries are successful.

— Reply to this email directly or view it on GitHub https://github.com/NicosKaralis/pushmeup/issues/45#issuecomment-68058187 .

kert-io avatar Dec 24 '14 18:12 kert-io

It's the same error as what @prockport has mentioned. It would seem that the @ssl object is nil during the stop_persistence method.

backtrace:

/bundler/gems/pushmeup-9a95b551296d/lib/pushmeup/apns/core.rb:31 in "stop_persistence"
/app/jobs/sitata_jobs/push_notifications_job.rb:36 in "push_ios"
/app/jobs/sitata_jobs/push_notifications_job.rb:13 in "perform"
/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:75 in "execute_job"
/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:52 in "process"
org/jruby/RubyProc.java:271 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127 in "invoke"
/gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/sidekiq.rb:33 in "call"
/gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/controller_instrumentation.rb:366 in "perform_action_with_newrelic_trace"
/gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/sidekiq.rb:29 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/retry_jobs.rb:74 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:11 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/logging.rb:22 in "with_context"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:7 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke"
org/jruby/RubyProc.java:271 in "call"
/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132 in "invoke"
/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:51 in "process"
/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:98 in "stats"
/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:50 in "process"
org/jruby/RubyKernel.java:1958 in "public_send"
/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26 in "dispatch"
/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122 in "dispatch"
/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60 in "invoke"
/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71 in "task"
/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357 in "task"
/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57 in "initialize"
/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:47 in "initialize"
/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15 in "create"

Our push notification method is abstracted to a high level:

def push_ios
    APNS.start_persistence
    with_devices(:ios) do |token|
      APNS.send_notification(token, @notification.ios_push_message)
    end
    APNS.stop_persistence
end

There is nothing abnormal about the procedure or the message we are sending. Telnetting into APNS manually works fine. Subsequent retries by the sidekiq worker also work so the problem seems intermittent. This is on jRuby 1.7.18.

astjohn avatar Dec 24 '14 23:12 astjohn

I have a separate worker class just for push. I open a persistent connection for the class and leave it open. This should be done along with your other configs like APNS.pem etc. Then, any call to the send_notification method of the push class runs in the open persistent connection.

Basically, take the start_persistent out of the method call. I use sidekiq for all push, and it works with the setup described above. It has been running for months now without skipping a beat.

On Wed, Dec 24, 2014 at 6:24 PM, Adam St. John [email protected] wrote:

It's the same error as what @prockport https://github.com/prockport has mentioned. It would seem that the @ssl object is nil during the stop_persistence method.

backtrace:

/bundler/gems/pushmeup-9a95b551296d/lib/pushmeup/apns/core.rb:31 in "stop_persistence" /app/jobs/sitata_jobs/push_notifications_job.rb:36 in "push_ios" /app/jobs/sitata_jobs/push_notifications_job.rb:13 in "perform" /gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:75 in "execute_job" /gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:52 in "process" org/jruby/RubyProc.java:271 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127 in "invoke" /gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/sidekiq.rb:33 in "call" /gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/controller_instrumentation.rb:366 in "perform_action_with_newrelic_trace" /gems/newrelic_rpm-3.9.8.273/lib/new_relic/agent/instrumentation/sidekiq.rb:29 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/retry_jobs.rb:74 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:11 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/logging.rb:22 in "with_context" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:7 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129 in "invoke" org/jruby/RubyProc.java:271 in "call" /gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132 in "invoke" /gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:51 in "process" /gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:98 in "stats" /gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:50 in "process" org/jruby/RubyKernel.java:1958 in "public_send" /gems/celluloid-0.16.0/lib/celluloid/calls.rb:26 in "dispatch" /gems/celluloid-0.16.0/lib/celluloid/calls.rb:122 in "dispatch" /gems/celluloid-0.16.0/lib/celluloid/cell.rb:60 in "invoke" /gems/celluloid-0.16.0/lib/celluloid/cell.rb:71 in "task" /gems/celluloid-0.16.0/lib/celluloid/actor.rb:357 in "task" /gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57 in "initialize" /gems/celluloid-0.16.0/lib/celluloid/tasks.rb:47 in "initialize" /gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15 in "create"

Our push notification method is abstracted to a high level:

def push_ios APNS.start_persistence with_devices(:ios) do |token| APNS.send_notification(token, @notification.ios_push_message) end APNS.stop_persistence end

There is nothing abnormal about the procedure or the message we are sending. Telnetting into APNS manually works fine. Subsequent retries by the sidekiq worker also work so the problem seems intermittent. This is on jRuby 1.7.18.

— Reply to this email directly or view it on GitHub https://github.com/NicosKaralis/pushmeup/issues/45#issuecomment-68078389 .

kert-io avatar Dec 25 '14 04:12 kert-io

@prockport, that makes sense and so I will most likely restructure our sidekiq worker. That said, I think the root of this problem might be the fact that @sock, @ssl, etc. are module variables. They are not currently threadsafe. Therefore, when we had multithreaded workers executing, @ssl was set to nil by the second thread before the first thread had a chance to finish. Yes, opening up a persistent connection for the worker class does solve the issue, but it doesn't solve the underlying issue. I suggest restructuring or leaving some instructions in the README about how to use the current API in a threadsafe manner. Otherwise, I fear this issue will be raised by future users.

astjohn avatar Jan 01 '15 16:01 astjohn

I'm really with a lot of work at the moment but I want to completely redesign this gem, one of the things that bugs me a lot is this problem you are having.

I don't understand why would you need to open a connection from one thread and keep it open for a very long time

this code:

def push_ios
    APNS.start_persistence
    with_devices(:ios) do |token|
      APNS.send_notification(token, @notification.ios_push_message)
    end
    APNS.stop_persistence
end

is the same as this:

def push_ios
  notifications = []
  with_devices(:ios) do |token|
    notifications << APNS::Notification.new(token, @notification.ios_push_message)
  end
  APNS.send_notifications(notifications)
end

And if you still need to open a persistent connection because one method opens the connection and another closes it, maybe you shouldn't do that, especially when you work in threads like sidekiq or resque workers

NicosKaralis avatar Jan 19 '15 05:01 NicosKaralis

@NicosKaralis, I took a stab at redesigning the gem based on feedback and pull requests from others.

Please take a look here: https://github.com/Sitata/pushmeup/tree/ios8

It uses a new structure with a Gateway class. In this way, each gateway can have it's own scope on api keys and such in case the server needs to send notifications for multiple apps. This was suggested by @pepibumur, but I thought the class Gateway was more appropriate that Application.

With a Gateway class, I think many of the threading issues have been addressed. We can open and close the apns connection for as long as needed. I also happen to disagree with keeping a persistent connection open forever.

I also introduced a configuration pattern seen in many other gems in which you would setup everything in a rails initializer.

I have also included a few changes to address iOS 8 issues.

Usage is as follows:

# /config/initializers/pushmeup.rb
Pushmeup.configure do |config|

  # Setup default stuff for gateways - can be overriden
  if Rails.env.production?
    config.apns_pem  = Figaro.env.apns_pem
    config.apns_pass = Figaro.env.apns_pass
  else
    config.apns_pem  = File.join(Rails.root, "config", "apns", Figaro.env.apns_pem)
    config.apns_host = "gateway.sandbox.push.apple.com"
  end

  config.apns_pass = Figaro.env.apns_pass
  config.gcm_key   = Figaro.env.google_api_key
end
    # Wherever you need to push to apns
    # IOS can keep a persistent connection and so we use that to send to all users
    # at a single time
    def push_ios
      g = Pushmeup::APNS::Gateway.new(persistent: true)

      with_devices(:ios) do |token|
        g.send_notification(token, @notification.ios_push_message)
      end

      g.close
    end

We've been using this in production for gps and apns. Please check it out and let me know what you think. Amazon has not been tested.

@prockport - If you want me to submit a pull for this big re-write, just let me know and I'll try to find some time to improve the specs, which are largely empty at this point.

astjohn avatar Jan 19 '15 14:01 astjohn

Excellent job @astjohn. It is exactly what I need to grab this gem as solution to our platform. Do you have intention to work in Amazon refactory?

maxintech avatar Jan 23 '15 22:01 maxintech

@maxintech, Not completely sure what you mean by Amazon refactory, but I did refactor all of the current providers.

astjohn avatar Jan 24 '15 01:01 astjohn

I'm sorry, I misread the last sentence "Amazon has not been tested.". Never mind... I will take a chance next week and I let you know if is working in Amazon too.

maxintech avatar Jan 24 '15 21:01 maxintech

@maxintech - sounds great. Let me know how it goes.

astjohn avatar Jan 25 '15 00:01 astjohn

Hey @astjohn, the lib works fine in all three providers. One downside for our application is in APNS you can only pass a full path for the certificate in PEM format. The same behaviour is in the original gem. For our solution is kind of bummer because we already have the PEMs in memory and we cannot access to a physical file in the machine the application runs.

maxintech avatar Feb 12 '15 18:02 maxintech

@maxintech - how do you place the PEM in memory?

astjohn avatar Feb 12 '15 18:02 astjohn

@astjohn: To be clear. Our application is a Gaming Platform. We serve more than one application. All the configuration of each application, including Push Notification and IAP keys and certificates are in the database. The first time the platform needs some attribute of some application reads the database and cache the configuration in a memcache.

maxintech avatar Feb 12 '15 18:02 maxintech

@maxintech, interesting. Well, if you want to take a kick at it, I could see a simple solution as making a set_pem method on the gateway that sets @pem_file directly so that this method returns your value instead of File.read(pem). Feel free to open an isssue on the Sitata branch/tracker and we can take a look at it further over there.

I still haven't heard anything from @prockport regarding pulling the branch into this repo.

astjohn avatar Feb 12 '15 18:02 astjohn

Thank you @astjohn. I will add the issue on the Sitata branch/tracker.

maxintech avatar Feb 12 '15 21:02 maxintech