The initial issue that triggered this rework is that the forced storage in
database was working only on writes, and was never applied on attachment
creations.
This feature stores small files that must be read quickly in the
database rather than in the object storage. Reading a file from the object
storage can take 150-200ms, which is fine for downloading a PDF file or a single
image, but not if you need 40 thumbnails.
While working on a fix, I found that:
* the logic to force storage was called in `_inverse_datas`, which is not called
during a create
* Odoo implemented a new method `_get_datas_related_values`, a model
method that receives only the data and the mimetype, returns the attachment
values and writes the file to the correct place
The `_get_datas_related_values` method is where we want to plug this special
storage, as it is called for both create and write, and already handles the
values and the conditional write. But with this method, we have less information
than before about the attachment, so let's review the different criteria we had before:
* res_model: we were using it to always store attachments related to
'ir.ui.view' in db, because assets are related to this model. However, we
don't really need to check this: we should store any javascript and css
documents in database.
* exclude res_model: we could have an exclusion list, to tell that for instance,
for mail.message, we should never store any image in db. We don't have this
information anymore, but I think it was never used and was added "just in case":
the default configuration is "mail.mail" and "mail.message" and I
couldn't find any attachment with such a res_model in any of our biggest
databases. So this is removed.
* mimetype and data (size) are the last criteria and we still have them
The new system is only based on mimetype and data size and I think it's actually
more versatile. Previously, we could set a global size and include mimetypes,
but we couldn't say "I want to store all images below 50KB and all files of type
X below 10KB". Now, we have a single system parameter with a dict configuration
(`ir_attachment.storage.force.database`) defaulting to:
{"image/": 51200, "application/javascript": 0, "text/css": 0}
Assets have a limit of zero, which means they will all be stored in the database
whatever their size is.
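
As a rough sketch of how such a configuration could be evaluated: the helper name `should_store_in_db` is hypothetical, and both the prefix matching on the mimetype and the inclusive size limit are assumptions made for illustration, not the confirmed behaviour of the module.

```python
import json

# Default value quoted above for `ir_attachment.storage.force.database`.
DEFAULT_CONFIG = '{"image/": 51200, "application/javascript": 0, "text/css": 0}'


def should_store_in_db(mimetype, size, raw_config=DEFAULT_CONFIG):
    """Return True if a file of this mimetype and size belongs in the database.

    Hypothetical helper: mimetypes are matched by prefix and a limit of 0
    means "no size limit" (both are assumptions made for this sketch).
    """
    limits = json.loads(raw_config)
    for prefix, limit in limits.items():
        if mimetype.startswith(prefix):
            return limit == 0 or size <= limit
    return False
```

Under these assumptions, a 1 KB PNG and a CSS file of any size would go to the database, while a 100 KB PNG or a PDF would go to the object storage.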
Overall, this is a great simplification of the module too, as the method
`_get_datas_related_values` integrates it better in the base calls of IrAttachment.
Note for upgrade:
I doubt we customized the previous system parameters which are now obsolete, but
if yes, the configuration may need to be moved to `ir_attachment.storage.force.database`.
For the record, the params were:
* mimetypes.list.storedb (default: image)
* file.maxsize.storedb (default: 51200)
* excluded.models.storedb (mail.message,mail.mail), no equivalent now
The method IrAttachment.force_storage_to_db_for_special_fields() should be called
through a migration script on existing databases to move the attachments back into
the database.
Some attachments (e.g. image_small, image_medium) are stored in DB
instead of the object storage for faster access.
In some situations, we may have pushed all these files on the Object
Storage (migration from a filesystem to object storage) and want to
bring back these attachments from the object storage to the database.
This method is not called anywhere but can be called by RPC or scripts.
The labs env can be anything starting with 'labs', such as
'labs-logistics', 'labs-finance', ...
* At install, s3/swift is set as default storage
* However, unlike prod/integration, the storage is not forced to be an
object storage
* Redis is required
* When the storage is set on s3/swift, then the bucket name is mandatory
(otherwise, there is nowhere to create the files...)
The redis prefix regex match is relaxed: anything starting with a project
name, then '-odoo-', then any combination of letters, digits, and dashes
is accepted (so a prefix my-project9-odoo-labs-web3 is valid).
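
A minimal sketch of such a relaxed check, assuming the project name is known in advance. The helper `redis_prefix_pattern` is hypothetical and the actual regex in the code may be written differently.

```python
import re


def redis_prefix_pattern(project_name):
    """Build a pattern for '<project>-odoo-<letters, digits, dashes>'.

    Hypothetical helper; the real check may be written differently.
    """
    return re.compile(r"^%s-odoo-[A-Za-z0-9-]+$" % re.escape(project_name))


pattern = redis_prefix_pattern("my-project9")
valid = bool(pattern.match("my-project9-odoo-labs-web3"))
missing_odoo = bool(pattern.match("my-project9-labs-web3"))
```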
The names of the libs and of their python packages are different. Odoo expects
the inner python package in the manifest, but setuptools cannot find the
libs on pypi under those names, so override them with the lib names.
OVH's Swift applies a rate limit on the authentication.
attachment_swift authenticates again each time it has to read/write an
attachment. When running upgrades of files or installing a
new DB, at some point, we get rejected with HTTP 429.
This commit introduces a shared storage for the Swift Session. All
connections will reuse the same authentication token created the first
time a connection needs a Session.
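
The idea can be sketched as a small process-wide cache. This is only an illustration: `get_shared_session` and the `create_session` callable are hypothetical names, and the real code may key or store the session differently.

```python
import threading

_session_lock = threading.Lock()
_shared_sessions = {}


def get_shared_session(auth_key, create_session):
    """Return the session cached for `auth_key`, creating it only once.

    `create_session` is a hypothetical callable standing in for whatever
    call performs the real (rate-limited) authentication.
    """
    with _session_lock:
        if auth_key not in _shared_sessions:
            _shared_sessions[auth_key] = create_session()
        return _shared_sessions[auth_key]


# Usage: the factory runs on the first call only.
calls = []


def fake_authentication():
    calls.append(1)
    return object()


first = get_shared_session(("auth-url", "user"), fake_authentication)
second = get_shared_session(("auth-url", "user"), fake_authentication)
```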
Note: needs python-swiftclient>=3.7.0 to have
https://github.com/openstack/python-swiftclient/commit/1971ef880ff225379d4a91f00f89f323a1605eeb
Assume the following situation:
* We have installed addons base, sale and attachment_s3 (hence
base_attachment_object_storage as dependency)
* All attachments are in S3 already
* We run an upgrade of the 'base' addon, 'sale' is upgraded before
attachment_s3 in the order of loading.
* Sale updates the icon of the Sale menu
* As attachment_s3 is not loaded yet, the attachment is created in the
filestore
Now if we don't persist the filestore or use different servers, we'll
lose the images of the menus (or any attachment loaded by the
install/upgrade of an addon).
The implemented solution is to move the attachments from the filestore
to the object storage at the loading of the module. However, this
operation can take time and it shouldn't be run by 2 processes at the
same time, so we want to detect if the module is loaded during a normal odoo
startup or when some addons have been upgraded. Nothing is left
at this point that allows us to know that modules have just been
upgraded except... the caller frame (load_modules). We have to rely
on the inspect module and get the caller frame, which is not recommended
but seems to be the only way. Besides, it's not called often, and if
_register_hook was called from another place, it would have no effect
(unless the other place has a variable 'update_module' too).
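
A simplified sketch of the frame inspection described above. The functions `register_hook` and `load_modules` here are stand-ins, not Odoo's real ones, and the assumption is that the hook is called directly by the function holding the `update_module` local.

```python
import inspect


def called_during_module_update():
    """Return True if the caller's caller has a truthy local `update_module`."""
    frame = inspect.currentframe()
    try:
        # frame.f_back is the hook; its own f_back is whoever called the hook.
        caller = frame.f_back.f_back
        return bool(caller and caller.f_locals.get("update_module"))
    finally:
        # Drop the frame reference to avoid reference cycles.
        del frame


def register_hook():
    # Stand-in for the model's _register_hook.
    return called_during_module_update()


def load_modules(update_module=False):
    # Stand-in for the loader, which has a local named `update_module`.
    return register_hook()
```

If the hook is called from anywhere without an `update_module` local, the lookup simply returns False and nothing is migrated.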
When moving attachments from the filestore to an object storage, the
filesystem files will be deleted only after the commit, so if the
transaction is rolled back, we still have the local files for another
try.
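
The ordering can be illustrated with a toy transaction object. `FakeTransaction` and `move_to_object_storage` are hypothetical stand-ins for the database cursor and the migration code; only the "delete after commit" ordering is the point of the sketch.

```python
import os
import tempfile


class FakeTransaction:
    """Hypothetical stand-in for a database cursor with post-commit hooks."""

    def __init__(self):
        self._after_commit = []

    def after_commit(self, func):
        self._after_commit.append(func)

    def commit(self):
        hooks, self._after_commit = self._after_commit, []
        for func in hooks:
            func()

    def rollback(self):
        # Pending hooks are dropped: the local files stay in place.
        self._after_commit = []


def move_to_object_storage(tx, path, upload):
    """Upload `path`, then schedule the local delete for after the commit."""
    with open(path, "rb") as handle:
        upload(handle.read())
    tx.after_commit(lambda: os.remove(path))


tx = FakeTransaction()
fd, path = tempfile.mkstemp()
os.close(fd)
uploaded = []

move_to_object_storage(tx, path, uploaded.append)
tx.rollback()
still_there_after_rollback = os.path.exists(path)

move_to_object_storage(tx, path, uploaded.append)
tx.commit()
gone_after_commit = not os.path.exists(path)
```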
The default expiration of sessions is 7 days. With healthchecks run
every few seconds, we quickly have millions of anonymous sessions in
Redis. Allow defining a custom expiration for some sessions and set a
very short one for the monitoring requests.
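
As an illustration only, the choice of expiration could look like the following. The monitoring path and the 60-second value are placeholders chosen for this sketch, not values taken from the code.

```python
DEFAULT_EXPIRATION = 7 * 24 * 60 * 60  # 7 days, in seconds
MONITORING_EXPIRATION = 60  # hypothetical short lifetime, in seconds


def session_expiration(path, monitoring_paths=("/monitoring/status",)):
    """Pick a session lifetime for the request path.

    The monitoring path and the 60-second value are placeholders.
    """
    if path in monitoring_paths:
        return MONITORING_EXPIRATION
    return DEFAULT_EXPIRATION
```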
The previous error message implied that you must set AWS_BUCKETNAME,
although it is only required if you are trying to write in this
repository.
An Object Storage read is slower than a disk or database access.
It might take ~200 to 300ms to retrieve a file content.
This is not an issue for attachments such as the pdf files or any
attachment that we want to read on demand. But that's too slow for
files needed to render a web page.
We'll store in the database:
* Assets (js, css, ...). As a side effect, the databases will be more
portable: as assets are rebuilt frequently, storing them in the Object
Storage led the integration server to try to read assets deleted
long ago
* Attachments linked to Binary fields named 'image_small',
'image_medium', 'web_icon_data'. Those fields are often used on kanban
views that display a lot of images, and retrieving them all was then
very slow (Odoo does not do async requests).
The migration to S3 is no longer called during initialization of the
registry: it would be too slow as we would have to determine whether the
attachments must be kept in database or sent to S3 on each new start. It
means we have to call `env['ir.attachment'].force_storage()` to run the
migration.
Because it provokes serialization errors during the installation or
update of addons. Do not commit as we might commit unwanted things...
Later, we might want to add a specific, more elaborate, migration
process.
* store the S3 uri in the 'store_fname' (s3://bucket/key)
* the read-only mode is now built-in: as we store the bucket name, if an
instance is started with a different bucket or another filestore
method, it will continue to read the previous attachments on their
stored bucket, but new attachments will be stored on the new one
* remove config in ir.config_parameter, it makes all the stuff more
complex and we don't use them (config file would be more interesting)
* automatically migrate the attachments on loading of the server, so
if an ir.attachment has been created during the module
upgrade/initialization before attachment_s3 is loaded, it will be sent
to S3 as soon as it's loaded
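
For illustration, a stored `store_fname` of this shape can be split back into bucket and key as follows. `parse_s3_uri` is a hypothetical helper, not necessarily what the addon does.

```python
from urllib.parse import urlsplit


def parse_s3_uri(store_fname):
    """Split 's3://bucket/key' into (bucket, key)."""
    parts = urlsplit(store_fname)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError("not an S3 URI: %r" % store_fname)
    return parts.netloc, parts.path.lstrip("/")
```

Because the bucket travels with each attachment, a read can target the bucket recorded at write time even when the instance is now configured with another one.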