
couch_doc_update fails randomly with "Request Timeout" during CI tests

Today we got these mails from rewdevcouch1 and rewdevcouch2:

Output of error log below:

Apr 12 06:08:12    - [rewdevcouch2.rewire.org] err: /Stage[main]/Site_couchdb::Designs/Site_couchdb::Upload_design[shared_transactions]/Exec[upload_design_shared_transactions]/returns: change from notrun to 0 failed: /usr/local/bin/couch-doc-update --host 127.0.0.1:5984 --db 'shared' --id '_design/transactions' --data '{}' --file '/srv/leap/couchdb/designs/shared/transactions.json' returned 1 instead of one of [0] at /srv/leap/puppet/modules/site_couchdb/manifests/upload_design.pp:12
Apr 12 06:08:18  = warning: puppet did not finish successfully. 

-------------------------------------------------------------------

error log: /var/log/leap/rewire/develop/deploy-rewdevcouch2-2015-04-12-060019-error.log
comlete log: /var/log/leap/rewire/develop/deploy-rewdevcouch2-2015-04-12-060019.log

Tested on Sun Apr 12 06:08:19 UTC 2015 on "rewdevcouch2" with following versions/git commit IDs: 

Provider (/home/testbot/platform-test/rewire/develop/rewire): not under version control

 = leap command v1.7.1 (develop ea5be4ea7b6f0b269ac54655f01c7cd6dc28ece7)


 = leap platform v0.7 (develop 051b80d25bd0fe400769976c3210718689f832b4)




Output of error log below:

Apr 12 06:07:53    - [rewdevcouch1.rewire.org] err: /Stage[main]/Site_couchdb::Add_users/Couchdb::Add_user[webapp]/Couchdb::Document[update_user_webapp]/Exec[couch-doc-update --netrc-file /etc/couchdb/couchdb.netrc --host 127.0.0.1:5986 --db _users --id org.couchdb.user:webapp --data '{"type": "user", "name": "webapp", "roles": ["tokens","identities","users"], "password_sha": "e766b8dafbfe871bfa0f9d679f6335a995c26f98", "salt": "edc4c9386cc3ebb8b1c9e1b044dae4ad"}']/returns: change from notrun to 0 failed: couch-doc-update --netrc-file /etc/couchdb/couchdb.netrc --host 127.0.0.1:5986 --db _users --id org.couchdb.user:webapp --data '{"type": "user", "name": "webapp", "roles": ["tokens","identities","users"], "password_sha": "e766b8dafbfe871bfa0f9d679f6335a995c26f98", "salt": "edc4c9386cc3ebb8b1c9e1b044dae4ad"}' returned 1 instead of one of [0] at /srv/leap/puppet/modules/couchdb/manifests/document.pp:24
Apr 12 06:08:19  = warning: puppet did not finish successfully. 

-------------------------------------------------------------------

error log: /var/log/leap/rewire/develop/deploy-rewdevcouch1-2015-04-12-060011-error.log
comlete log: /var/log/leap/rewire/develop/deploy-rewdevcouch1-2015-04-12-060011.log

Tested on Sun Apr 12 06:08:19 UTC 2015 on "rewdevcouch1" with following versions/git commit IDs: 

Provider (/home/testbot/platform-test/rewire/develop/rewire): not under version control

 = leap command v1.7.1 (develop ea5be4ea7b6f0b269ac54655f01c7cd6dc28ece7)


 = leap platform v0.7 (develop 051b80d25bd0fe400769976c3210718689f832b4)


But "leap test" is all green (I'll open a separate issue for this):



Tested on Sun Apr 12 06:12:42 UTC 2015 on these nodes: "rewdevcouch1 rewdevcouch2 rewdevmx1 rewdevvpn1 rewdevweb1 rewdevplain1"
with following versions/git commit IDs: 

Provider (/home/testbot/platform-test/rewire/develop/rewire): not under version control

 = leap command v1.7.1 (develop ea5be4ea7b6f0b269ac54655f01c7cd6dc28ece7)


 = leap platform v0.7 (develop 051b80d25bd0fe400769976c3210718689f832b4)


test-2015-04-12-060001.log


Running leap test on 2015-04-12-061234
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Are daemons running?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Is CouchDB running?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Is cluster membership ok?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Are configured nodes online?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Do ACL users exist?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Do required databases exist?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] PASS: CouchDB > Can records be created?
Apr 12 06:12:41  = [rewdevcouch2.rewire.org] 10 tests: 10 passes, 0 skips, 0 warnings, 0 failures, 0 errors
Apr 12 06:12:41  = [rewdevplain1.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevplain1.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevplain1.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevplain1.rewire.org] 3 tests: 3 passes, 0 skips, 0 warnings, 0 failures, 0 errors
Apr 12 06:12:41  = [rewdevvpn1.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevvpn1.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevvpn1.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevvpn1.rewire.org] PASS: OpenVPN > Are daemons running?
Apr 12 06:12:41  = [rewdevvpn1.rewire.org] 4 tests: 4 passes, 0 skips, 0 warnings, 0 failures, 0 errors
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Are daemons running?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Is CouchDB running?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Is cluster membership ok?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Are configured nodes online?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Do ACL users exist?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Do required databases exist?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: CouchDB > Can records be created?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] PASS: Soledad > Is Soledad running?
Apr 12 06:12:41  = [rewdevcouch1.rewire.org] 11 tests: 11 passes, 0 skips, 0 warnings, 0 failures, 0 errors
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Mx > Can contact couchdb?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Mx > Can contact couchdb via haproxy?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] PASS: Mx > Are MX daemons running?
Apr 12 06:12:41  = [rewdevmx1.rewire.org] 6 tests: 6 passes, 0 skips, 0 warnings, 0 failures, 0 errors
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Network > Can connect to internet?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Network > Is stunnel running?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Network > Is shorewall running?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Can contact couchdb?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Can contact couchdb via haproxy?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Are daemons running?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Can access webapp?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Can create and authenticate and delete user via API?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] PASS: Webapp > Can sync Soledad?
Apr 12 06:12:41  = [rewdevweb1.rewire.org] 9 tests: 9 passes, 0 skips, 0 warnings, 0 failures, 0 errors
OK - "leap test" is all green !

By the way, I added the "Exec<||> { logoutput => on_failure }" statement to site_config::default, hoping to get more details in the logs, but it doesn't seem to work. I have an idea why and am opening another issue.
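
For reference, the override mentioned above would look roughly like the sketch below. It is only an illustration: the actual contents of site_config::default are not shown in this report, so the placement is an assumption. Note that a collector with an empty search expression also realizes any virtual Exec resources, which may or may not be wanted.

```puppet
# Sketch only; where exactly this sits inside site_config::default is an assumption.
# The collector matches every Exec resource in the catalog and overrides its
# logoutput attribute, so command output is logged only when the command fails.
Exec <| |> {
  logoutput => on_failure,
}
```

If the override does not take effect, a resource default (Exec { logoutput => on_failure }) in the relevant scope might be worth trying instead, though that has not been verified here.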

(from redmine: created on 2015-04-12, closed on 2015-05-15, relates #6851 (closed), relates #6852)