
Cannot Connect to MariaDB

Description

GeneNetwork3 is failing to connect to MariaDB with the error:

⋮
2024-11-05 14:49:00 Traceback (most recent call last):
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request
2024-11-05 14:49:00     rv = self.dispatch_request()
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request
2024-11-05 14:49:00     return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/api/menu.py", line 13, in generate_json
2024-11-05 14:49:00     with database_connection(current_app.config["SQL_URI"], logger=current_app.logger) as conn:
2024-11-05 14:49:00   File "/gnu/store/lzw93sik90d780n09svjx5la1bb8g3df-python-3.10.7/lib/python3.10/contextlib.py", line 135, in __enter__
2024-11-05 14:49:00     return next(self.gen)
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/gn3/db_utils.py", line 34, in database_connection
2024-11-05 14:49:00     connection = mdb.connect(db=db_name,
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/__init__.py", line 121, in Connect
2024-11-05 14:49:00     return Connection(*args, **kwargs)
2024-11-05 14:49:00   File "/gnu/store/83v79izrqn36nbn0l1msbcxa126v21nz-profile/lib/python3.10/site-packages/MySQLdb/connections.py", line 195, in __init__
2024-11-05 14:49:00     super().__init__(*args, **kwargs2)
2024-11-05 14:49:00 MySQLdb.OperationalError: (2002, "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)")

We have previously defined the default socket file[^1][^2] as "/run/mysqld/mysqld.sock".
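The traceback suggests that the connect call received no explicit socket path, so the client library presumably fell back to its built-in default of `/tmp/mysql.sock`. The following is a minimal sketch, not the actual `gn3.db_utils` code, of how the socket could be passed explicitly. The helper name `connection_kwargs` and the `unix_socket` query parameter in the URI are assumptions made for illustration; `unix_socket` itself is a keyword that the `MySQLdb.connect` function accepts.

```python
from urllib.parse import parse_qs, urlparse

# Default socket path named in this issue.
DEFAULT_SOCKET = "/run/mysqld/mysqld.sock"


def connection_kwargs(sql_uri, default_socket=DEFAULT_SOCKET):
    """Hypothetical helper: turn a SQL URI into connect() keyword arguments."""
    parsed = urlparse(sql_uri)
    query = parse_qs(parsed.query)
    host = parsed.hostname or "localhost"
    kwargs = {
        "db": parsed.path.lstrip("/"),
        "user": parsed.username,
        "passwd": parsed.password,
        "host": host,
    }
    if host == "localhost":
        # Local connections go through the socket file, so name it explicitly
        # rather than relying on the client library's compiled-in default.
        kwargs["unix_socket"] = query.get("unix_socket", [default_socket])[0]
    elif parsed.port:
        kwargs["port"] = parsed.port
    return kwargs


print(connection_kwargs("mysql://webqtlout:webqtlout@localhost/db_webqtl"))
```

The resulting dictionary could then be splatted into the connect call, e.g. `mdb.connect(**connection_kwargs(uri))`, assuming the service account can reach the socket inside the container.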

Troubleshooting Logs

2024-11-05

I attempted to just bind `/run/mysqld/mysqld.sock` to `/tmp/mysql.sock` by adding the following mapping in GN3's `gunicorn-app` definition:

(file-system-mapping
 (source "/run/mysqld/mysqld.sock")
 (target "/tmp/mysql.sock")
 (writable? #t))

but that does not fix things.
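One way to narrow this down is to check, from inside the container, whether each candidate path exists and is actually a socket. The snippet below is a small diagnostic sketch using only the Python standard library; the function name `describe_socket` is made up for this example.

```python
import os
import stat


def describe_socket(path):
    """Report whether `path` is missing, unreadable, a socket, or something else."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "unreadable"
    return "socket" if stat.S_ISSOCK(mode) else "not a socket"


# The two paths discussed in this issue.
for candidate in ("/tmp/mysql.sock", "/run/mysqld/mysqld.sock"):
    print(candidate, "->", describe_socket(candidate))
```

If the mapped target shows up as "missing" inside the container, the mapping itself is not taking effect; if it shows up as "socket" but the connection still fails, the problem is more likely on the server or permissions side.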

I also tried changing the MySQL URI to use an IP address, i.e.

SQL_URI="mysql://webqtlout:webqtlout@128.169.5.119:3306/db_webqtl"

but that simply changes the error from the above to the one below:

2024-11-05 15:27:12 MySQLdb.OperationalError: (2002, "Can't connect to MySQL server on '128.169.5.119' (115)")

I tried with both `127.0.0.1` and `128.169.5.119`.
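The number in parentheses at the end of each error message is the operating system's errno. On Linux, 2 is `ENOENT` (the socket path does not exist from the client's point of view) and 115 is `EINPROGRESS` (the TCP connection attempt did not complete). The sketch below decodes these from the message text; the helper names are invented for this example.

```python
import errno
import os
import re


def trailing_errno(message):
    """Extract the errno that the client library appends as '(N)'."""
    match = re.search(r"\((\d+)\)\s*$", message)
    return int(match.group(1)) if match else None


def explain(message):
    """Describe the trailing errno using the platform's own tables."""
    code = trailing_errno(message)
    if code is None:
        return "no errno found"
    name = errno.errorcode.get(code, "unknown")
    return f"{code} {name}: {os.strerror(code)}"


for msg in (
    "Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)",
    "Can't connect to MySQL server on '128.169.5.119' (115)",
):
    print(explain(msg))
```

Read this way, the socket error suggests the file is simply not visible inside the container, and the TCP error suggests the container probably cannot reach the database port at all, though neither reading is confirmed here.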

My hail-mary was to attempt to expose the `my.cnf` file generated by the `mysql-service-type` definition to the "pola-wrapper", but that is proving tricky, seeing as the file is generated elsewhere[^4] and we do not have a way of figuring out the actual final path of the file.

I tried:

(file-system-mapping
 (source (mixed-text-file "my.cnf"
                          (string-append "[client]\n"
                                         "socket=/run/mysqld/mysqld.sock")))
 (target "/etc/mysql/my.cnf"))

but that did not work either.

2024-11-07

Start digging into how GNU Guix services are defined[^5] to try and understand why the file mapping attempt did not work.

Looking at the code linked above, specifically lines 575 to 588 and line 166, it seems to me that the mapping attempt should have worked.

Try it again, taking care to verify that the paths are correct, with:

(file-system-mapping
 (source (mixed-text-file "my.cnf"
                          (string-append "[client-server]\n"
                                         "socket=/run/mysqld/mysqld.sock")))
 (target "/etc/my.cnf"))
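To verify from inside the container that the mapped file says what we expect, the generated text can be read back and the `socket` option looked up. This is a rough sketch using Python's `configparser`, which handles a simple fragment like the one above but not every MariaDB option-file feature (for example `!includedir` directives). The function name and the group lookup order are assumptions for illustration, not MariaDB's actual precedence rules.

```python
import configparser


def socket_from_cnf(text, groups=("client-server", "client")):
    """Return the `socket` option from the first listed group that defines it."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # option files may contain bare flags
        strict=False,          # tolerate repeated groups or options
        interpolation=None,    # treat '%' in values literally
    )
    parser.read_string(text)
    for group in groups:
        if parser.has_option(group, "socket"):
            return parser.get(group, "socket")
    return None


# Same content as the mixed-text-file above.
cnf = "[client-server]\n" "socket=/run/mysqld/mysqld.sock"
print(socket_from_cnf(cnf))
```

In practice the text would come from reading `/etc/my.cnf` inside the container, which also confirms that the file-system mapping placed the file where the client looks for it.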

Try rebuilding on tux04: started getting `Segmentation fault` errors out of the blue for many guix commands 🤦🏿. Try building container on local dev machine: this took a long time - quit and continue later.

2024-11-08

After guix broke, causing the `Segmentation fault` errors above, I did some troubleshooting and was able to finally fix that by pinning guix to version b0b988c41c9e0e591274495a1b2d6f27fcdae15a as shown in the troubleshooting transcript[^6].

Now the fixes I did to make python requests work with the newer guix (defined in guix-bioinformatics[^7]) seem to be leading to failures in the older guix version.

I attempted a rebase to reorder the commits so that the python requests commit comes last, which would make it easier to do a `git reset` before rebuilding the container. That was not successful.

Then rebuild the container. This exposes a bug in gn-auth.

Update the `public-jwks-uri` value for the client in the admin dashboard, and voila! Now the system works.

Attempt pulling guix "2394a7f5fbf60dd6adc0a870366adb57166b6d8b" into a profile locally: went through without a hitch.

Upgrade the guix daemon and restart it. Delete the profile and run `guix gc`, then try pulling guix "2394a7f5fbf60dd6adc0a870366adb57166b6d8b" again. It also went through without a problem. This rules out the daemon as the culprit: running `sudo -i guix pull --list-generations` on both tux04 and my local dev machine reports the same daemon commit, `2a6d96425eea57dc6dd48a2bec16743046e32e06`.

Footnotes
