Rework and fix storage forced in database

The initial issue that triggered this rework is that the forced storage in
database was working only on writes, and was never applied on attachment
creations.

This feature is used to store small files that need to be read in a fast way in
database rather than in the object storage. Reading a file from the object
storage can take 150-200ms, which is fine for downloading a PDF file or a single
image, but not if you need 40 thumbnails.

Down the path to make a correction, I found that:

* the logic to force storage was called in `_inverse_datas`, which is not called
  during a create
* odoo implemented a new method `_get_datas_related_values`, which is a model
  method that receive only the data and the mimetype, and return the attachment
  values and write the file to the correct place

The `_get_datas_related_values` is where we want to plug this special storage,
as it is called for create and write, and already handle the values and
conditional write. But using this method, we have less information than before
about the attachment, so let's review the different criterias we had before:

* res_model: we were using it to always store attachments related to
  'ir.ui.view' in db, because assets are related to this model. However, we
  don't really need to check this: we should store any javascript and css
  documents in database.
* exclude res_model: we could have an exclusion list, to tell that for instance,
  for mail.message, we should never store any image in db. We don't have this
  information anymore, but I think it was never used and added "in case of".
  Because the default configuration is "mail.mail" and "mail.message" and I
  couldn't find any attachment with such res_model in any of our biggest
  databases. So this is removed.
* mimetype and data (size) are the last criteria and we still have them

The new system is only based on mimetype and data size and I think it's actually
more versatile. Previously, we could set a global size and include mimetypes,
but we couldn't say "I want to store all images below 50KB and all files of type
X below 10KB". Now, we have a single system parameter with a dict configuration
(`ir_attachment.storage.force.database`) defaulting to:

    {"image/": 51200, "application/javascript": 0, "text/css": 0}

Assets have a limit of zero, which means they will all be stored in the database
whatever their size is.

Overall, this is a great simplification of the module too, as the method
`_get_datas_related_values` integrates it better in the base calls of IrAttachment.

Note for upgrade:

I doubt we customized the previous system parameters which are now obsolete, but
if yes, the configuration may need to be moved to `ir_attachment.storage.force.database`.
For the record, the params were:

* mimetypes.list.storedb (default: image)
* file.maxsize.storedb (default: 51200)
* excluded.models.storedb (mail.message,mail.mail), no equivalent now

The method IrAttachment.force_storage_to_db_for_special_fields() should be called
through a migration script on existing databases to move the attachments back into
the database.
This commit is contained in:
Guewen Baconnier
2020-05-28 07:26:42 +02:00
parent a2e4c8fc65
commit e3e3899143
5 changed files with 141 additions and 96 deletions
+4
View File
@@ -34,6 +34,10 @@ This addon must be added in the server wide addons with (``--load`` option):
``--load=web,attachment_s3``
The System Parameter ``ir_attachment.storage.force.database`` can be customized to
force storage of files in the database. See the documentation of the module
``base_attachment_object_storage``.
Limitations
-----------
+4
View File
@@ -34,6 +34,10 @@ This addon must be added in the server wide addons with (``--load`` option):
``--load=web,attachment_swift``
The System Parameter ``ir_attachment.storage.force.database`` can be customized to
force storage of files in the database. See the documentation of the module
``base_attachment_object_storage``.
Python Dependencies
-------------------
+33
View File
@@ -3,5 +3,38 @@ Base class for attachments on external object store
This is a base addon that regroup common code used by addons targeting specific object store
Configuration
-------------
Object storage may be slow, and for this reason, we want to store
some files in the database whatever.
Small images (128, 256) are used in Odoo in list / kanban views. We
want them to be fast to read.
They are generally < 50KB (default configuration) so they don't take
that much space in database, but they'll be read much faster than from
the object storage.
The assets (application/javascript, text/css) are stored in database
as well whatever their size is:
* a database doesn't have thousands of them
* of course better for performance
* better portability of a database: when replicating a production
instance for dev, the assets are included
This storage configuration can be modified in the system parameter
``ir_attachment.storage.force.database``, as a JSON value, for instance::
{"image/": 51200, "application/javascript": 0, "text/css": 0}
Where the key is the beginning of the mimetype to configure and the
value is the limit in size below which attachments are kept in DB.
0 means no limit.
Default configuration means:
* images mimetypes (image/png, image/jpeg, ...) below 50KB are
stored in database
* application/javascript are stored in database whatever their size
* text/css are stored in database whatever their size
@@ -1,16 +1,9 @@
<?xml version='1.0' encoding='utf-8'?>
<odoo noupdate="1">
<record id="default_mimes_type_storedb" model="ir.config_parameter">
<field name="key">mimetypes.list.storedb</field>
<field name="value">image</field>
</record>
<record id="default_file_maxsize_storedb" model="ir.config_parameter">
<field name="key">file.maxsize.storedb</field>
<field name="value">50000</field>
</record>
<record id="excluded_model_storedb" model="ir.config_parameter">
<field name="key">excluded.models.storedb</field>
<field name="value">mail.message,mail.mail</field>
<record id="ir_attachment_storage_force_database" model="ir.config_parameter">
<field name="key">ir_attachment.storage.force.database</field>
<field name="value">{"image/": 51200, "application/javascript": 0, "text/css": 0}</field>
</record>
</odoo>
@@ -12,8 +12,8 @@ import odoo
from contextlib import closing, contextmanager
from odoo import api, exceptions, models, _
from odoo.tools.mimetypes import guess_mimetype
from odoo.osv.expression import AND, normalize_domain
from odoo.osv.expression import AND, OR, normalize_domain
from odoo.tools.safe_eval import const_eval
_logger = logging.getLogger(__name__)
@@ -68,109 +68,114 @@ class IrAttachment(models.Model):
if update_module:
self.env['ir.attachment'].sudo()._force_storage_to_object_storage()
@api.model
def _save_in_db_domain(self):
@property
def _object_storage_default_force_db_config(self):
return {"image/": 51200, "application/javascript": 0, "text/css": 0}
def _get_storage_force_db_config(self):
param = self.env['ir.config_parameter'].sudo().get_param(
'ir_attachment.storage.force.database',
)
storage_config = None
if param:
try:
storage_config = const_eval(param)
except (SyntaxError, TypeError, ValueError):
_logger.exception(
"Could not parse system parameter"
" 'ir_attachment.storage.force.database', reverting to the"
" default configuration.")
if not storage_config:
storage_config = self._object_storage_default_force_db_config
return storage_config
def _store_in_db_instead_of_object_storage_domain(self):
"""Return a domain for attachments that must be forced to DB
Read the docstring of ``_save_in_db_anyway`` for more details.
Read the docstring of ``_store_in_db_instead_of_object_storage`` for
more details.
Used in ``force_storage_to_db_for_special_fields`` to find records
to move from the object storage to the database.
The domain must be inline with the conditions in
``_save_in_db_anyway``.
``_store_in_db_instead_of_object_storage``.
"""
excluded_model_settings = self.env['ir.config_parameter'].sudo().\
get_param('excluded.models.storedb', default='')
excluded_model_for_db_store = excluded_model_settings.split(',')
mimetypes_settings = self.env['ir.config_parameter'].sudo().get_param(
'mimetypes.list.storedb', default='')
mimetypes_for_db_store = mimetypes_settings.split(',')
filesize = self.env['ir.config_parameter'].sudo().get_param(
'file.maxsize.storedb', default='0')
domain = [
'|',
# assets are stored in 'ir.ui.view'
('res_model', '=', 'ir.ui.view'),
'&', '&',
('file_size', '<', int(filesize)),
('res_model', 'not in', excluded_model_for_db_store),
]
domain += ['|'] * (len(mimetypes_for_db_store) - 1)
domain += [('mimetype', '=like', mimetype) for mimetype in
mimetypes_for_db_store]
domain = []
storage_config = self._get_storage_force_db_config()
for mimetype_key, limit in storage_config.items():
part = [("mimetype", "=like", "{}%".format(mimetype_key))]
if limit:
part = AND([part, [("file_size", "<=", limit)]])
domain = OR([domain, part])
return domain
def _save_in_db_anyway(self):
def _store_in_db_instead_of_object_storage(self, data, mimetype):
""" Return whether an attachment must be stored in db
When we are using an Object Store. This is sometimes required
When we are using an Object Storage. This is sometimes required
because the object storage is slower than the database/filesystem.
We store image_small and image_medium from 'Binary' fields
because they should be fast to read as they are often displayed
in kanbans / lists. The same for web_icon_data.
Small images (128, 256) are used in Odoo in list / kanban views. We
want them to be fast to read.
They are generally < 50KB (default configuration) so they don't take
that much space in database, but they'll be read much faster than from
the object storage.
We store the assets locally as well. Not only for performance,
but also because it improves the portability of the database:
when assets are invalidated, they are deleted so we don't have
an old database with attachments pointing to deleted assets.
The assets (application/javascript, text/css) are stored in database
as well whatever their size is:
* a database doesn't have thousands of them
* of course better for performance
* better portability of a database: when replicating a production
instance for dev, the assets are included
The configuration can be modified in the ir.config_parameter
``ir_attachment.storage.force.database``, as a dictionary, for
instance::
{"image/": 51200, "application/javascript": 0, "text/css": 0}
Where the key is the beginning of the mimetype to configure and the
value is the limit in size below which attachments are kept in DB.
0 means no limit.
Default configuration means:
* images mimetypes (image/png, image/jpeg, ...) below 51200 bytes are
stored in database
* application/javascript are stored in database whatever their size
* text/css are stored in database whatever their size
The conditions must be inline with the domain in
``_save_in_db_domain``.
``_store_in_db_instead_of_object_storage_domain``.
"""
self.ensure_one()
# Note: we cannot use _save_in_db_domain because we can be working
# with new records here. The conditions must stay inline though.
# assets
if self.res_model == 'ir.ui.view':
# assets are stored in 'ir.ui.view'
return True
# Check if model must never be stored on DB
excluded_model_settings = self.env['ir.config_parameter'].sudo().\
get_param('excluded.models.storedb', default='')
excluded_model_for_db_store = excluded_model_settings.split(',')
if self.res_model in excluded_model_for_db_store:
return False
# Check if file size and mimetype fit requirements
data_to_store = self.datas
bin_data = base64.b64decode(data_to_store) if data_to_store else ''
current_mimetype = guess_mimetype(bin_data)
mimetypes_settings = self.env['ir.config_parameter'].sudo().get_param(
'mimetypes.list.storedb', default='')
mimetypes_for_db_store = mimetypes_settings.split(',')
if any(current_mimetype.startswith(val) for val in
mimetypes_for_db_store):
# get allowed size
filesize = self.env['ir.config_parameter'].sudo().get_param(
'file.maxsize.storedb', default='0')
if len(bin_data) < int(filesize):
return True
storage_config = self._get_storage_force_db_config()
for mimetype_key, limit in storage_config.items():
if mimetype.startswith(mimetype_key):
if not limit:
return True
bin_data = base64.b64decode(data) if data else b''
return len(bin_data) <= limit
return False
def _inverse_datas(self):
# override in order to store files that need fast access,
# we keep them in the database instead of the object storage
location = self.env.context.get('storage_location') or self._storage()
for attach in self:
if location in self._get_stores() and attach._save_in_db_anyway():
def _get_datas_related_values(self, data, mimetype):
storage = self.env.context.get('storage_location') or self._storage()
if data and storage in self._get_stores():
if self._store_in_db_instead_of_object_storage(data, mimetype):
# compute the fields that depend on datas
value = attach.datas
bin_data = base64.b64decode(value) if value else ''
vals = {
bin_data = base64.b64decode(data) if data else b''
values = {
'file_size': len(bin_data),
'checksum': self._compute_checksum(bin_data),
'db_datas': value,
# we seriously don't need index content on those fields
'index_content': False,
'index_content': self._index(bin_data, mimetype),
'store_fname': False,
'db_datas': data,
}
fname = attach.store_fname
# write as superuser, as user probably does not
# have write access
super(IrAttachment, attach.sudo()).write(vals)
if fname:
self._file_delete(fname)
continue
super(IrAttachment, attach)._inverse_datas()
return values
return super()._get_datas_related_values(data, mimetype)
@api.model
def _file_read(self, fname, bin_size=False):
@@ -245,7 +250,7 @@ class IrAttachment(models.Model):
with closing(registry.cursor()) as cr:
try:
yield self.env(cr=cr)
except:
except Exception:
cr.rollback()
raise
else:
@@ -307,9 +312,15 @@ class IrAttachment(models.Model):
domain = AND((
normalize_domain(
[('store_fname', '=like', '{}://%'.format(storage))]
[('store_fname', '=like', '{}://%'.format(storage)),
# for res_field, see comment in
# _force_storage_to_object_storage
'|',
('res_field', '=', False),
('res_field', '!=', False),
]
),
normalize_domain(self._save_in_db_domain())
normalize_domain(self._store_in_db_instead_of_object_storage_domain())
))
with self.do_in_new_env(new_cr=new_cr) as new_env: