<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>musdb | museum-digital: blog</title>
	<atom:link href="https://blog.museum-digital.org/category/development/musdb-en-development/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.museum-digital.org</link>
	<description>A blog on museum-digital and the broader digitization of museum work.</description>
	<lastBuildDate>Mon, 12 Jan 2026 17:19:00 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://blog.museum-digital.org/wp-content/uploads/2020/01/cropped-mdlogo-code-512px-32x32.png</url>
	<title>musdb | museum-digital: blog</title>
	<link>https://blog.museum-digital.org</link>
	<width>32</width>
	<height>32</height>
</image> 
<atom:link rel="search" type="application/opensearchdescription+xml" title="Search museum-digital: blog" href="https://blog.museum-digital.org/wp-json/opensearch/1.1/document" />	<item>
		<title>State of Development, December 2025</title>
		<link>https://blog.museum-digital.org/2026/01/12/state-of-development-december-2025/</link>
					<comments>https://blog.museum-digital.org/2026/01/12/state-of-development-december-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 12 Jan 2026 17:15:11 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[nodac]]></category>
		<category><![CDATA[IIIF]]></category>
		<category><![CDATA[Imports]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object editing (musdb)]]></category>
		<category><![CDATA[Object images]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<category><![CDATA[Single image view]]></category>
		<category><![CDATA[System administration]]></category>
		<category><![CDATA[User interface]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4616</guid>

					<description><![CDATA[December 2025 was an interesting month for museum-digital. An update to the PHP version used as well as a flood of requests by what is most likely AI scrapers forced us to make changes for improved stability, reducing and reformulating features rather than adding new ones and working on matters of systems administration over purely <a href="https://blog.museum-digital.org/2026/01/12/state-of-development-december-2025/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>December 2025 was an interesting month for museum-digital. An update to the PHP version used as well as a flood of requests by what is most likely AI scrapers forced us to make changes for improved stability, reducing and reformulating features rather than adding new ones and working on matters of systems administration over purely matters of code quite often. Add to that the long-promised update of the terms of use for German museums to more structured and lawyer-approved ones, and you get yet more small changes that do not directly concern the work of museums with museum-digital but rather improve the necessary context.</p>



<h2 class="wp-block-heading">musdb</h2>



<h3 class="wp-block-heading">Object overview</h3>



<p>In the default tile view of the object overview page, hovering over an object image has so far revealed the object&#8217;s name. As object names are often too long to display fully and inventory numbers are the primary means of identifying an object in most museums, this preview text has now been extended to include the inventory number.</p>



<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="570" src="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-object-list-1024x570.webp" alt="Screenshot in the object overview." class="wp-image-4613" srcset="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-object-list-1024x570.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-object-list-300x167.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-object-list-1536x855.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-object-list-2048x1140.webp 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Hovering over an object image in the tile view now also displays the inventory number.</figcaption></figure>



<h3 class="wp-block-heading">User management</h3>



<h4 class="wp-block-heading">New Options for Managing User Accounts: Disabling Accounts &amp; Setting Account Expiry Dates</h4>



<p>Two new options on user editing pages allow disabling logins on an account and setting an expiry date for the account. Both can be useful for administration: If a new worker joins the museum for a project with a clear-cut limitation on funding and time, one can now set the account&#8217;s expiry date to the end of the project right when the project begins. The account will then automatically be deleted when the project ends. Similarly, colleagues who leave service temporarily but for a prolonged time (e.g. for a sabbatical) and will not need their accounts during that time can have their accounts disabled.</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="398" src="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-options-1024x398.webp" alt="Screenshot of the user editing page in musdb." class="wp-image-4611" srcset="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-options-1024x398.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-options-300x116.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-options-1536x596.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-options-2048x795.webp 2048w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Two new options allow disabling user accounts and setting expiry dates for user accounts.</figcaption></figure>



<h4 class="wp-block-heading">List of Terms of Use</h4>



<p>A new tab on a user&#8217;s (own) account settings page provides the option to list all usage agreements / terms of use a user has agreed to in the context of their use of museum-digital / musdb and when they agreed to them.</p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="576" src="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-agreement-list-1024x576.webp" alt="Screenshot of the user editing page." class="wp-image-4612" srcset="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-agreement-list-1024x576.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-agreement-list-300x169.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-agreement-list-1536x864.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_musdb-user-agreement-list.webp 1949w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">A new tab on the user page lists all user agreements for musdb that the user has agreed to and when they did so.
</figcaption></figure>



<h3 class="wp-block-heading">Imports</h3>



<h4 class="wp-block-heading">Limiting Report Mail Size</h4>



<p>When a user runs imports themselves using the <a href="https://de.handbook.museum-digital.info/import/importe-selbst-durchfuehren.html">WebDAV upload</a>, the end of the import process &#8211; no matter whether it succeeds or fails &#8211; is marked by a report sent via mail. This report usually contains a list of noteworthy operations that happened during the import, e.g. which objects of which inventory number were imported to which object in musdb, identified by its ID. As imports grow, this list of operations grows. To avoid issues when sending the report, it is henceforth limited to a maximum of 2 MB or 10000 lines.</p>
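<p>A minimal sketch of how such a cap could work, written in Python for illustration and not taken from the importer itself. The function name <code>limit_report</code>, the reading of &#8220;2 MB&#8221; as 2 * 1024 * 1024 bytes, and the choice to keep the first lines rather than the last ones are all assumptions.</p>

```python
MAX_BYTES = 2 * 1024 * 1024  # assumed reading of "2 MB"
MAX_LINES = 10_000


def limit_report(lines, max_bytes=MAX_BYTES, max_lines=MAX_LINES):
    """Keep leading report lines until either limit is reached.

    Returns the kept lines and a flag telling whether anything was cut.
    """
    kept = []
    used = 0
    for line in lines:
        # Count the line as it would be sent: UTF-8 bytes plus a newline.
        size = len((line + "\n").encode("utf-8"))
        if len(kept) == max_lines or used + size > max_bytes:
            return kept, True
        kept.append(line)
        used += size
    return kept, False


report, truncated = limit_report([f"Imported object {i}" for i in range(10_005)])
print(len(report), truncated)
```

<p>In this sketch, whichever limit is hit first ends the report, and the flag lets the caller append a note that the list was shortened.</p>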



<h4 class="wp-block-heading">Dry-run Mode</h4>



<p>Sometimes it is useful to try running an import to see whether it would work without actually processing any data. This option has been available in the importer command line interface for a while, among other things powering <a href="https://quality.museum-digital.org/">museum-digital:qa</a>. It is now available in the import configuration for self-run imports as well using the setting <code>dry-run</code>. Enabling the setting stops the importer from actually writing the data into the database and changes the behavior if values that need to be mapped to values in controlled lists at museum-digital are encountered. Usually an import stops the moment such data is to be imported and not yet mapped. During a dry run, the error is collected and the import proceeds. All unmapped entries are listed together at the end of the import, allowing for a simpler mapping (possibly aided by <a href="https://concordance.museum-digital.org/">concordance.museum-digital.org</a>).</p>
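<p>The difference between the two modes can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the importer&#8217;s code: the record layout, the <code>material</code> field, the <code>mapping</code> dictionary and the list standing in for the database are all hypothetical.</p>

```python
class UnmappedValueError(Exception):
    """Raised when a value has no entry in the controlled list mapping."""


def run_import(records, mapping, dry_run=False):
    """Import records; in a dry run, collect unmapped values instead of stopping."""
    written = []   # stands in for database writes
    unmapped = []  # values that still need to be mapped
    for record in records:
        value = record["material"]
        if value not in mapping:
            if not dry_run:
                # Regular import: stop at the first unmapped value.
                raise UnmappedValueError(value)
            if value not in unmapped:
                unmapped.append(value)
            continue
        if not dry_run:
            written.append({**record, "material": mapping[value]})
    return written, unmapped


records = [
    {"inv": "A1", "material": "Holz"},
    {"inv": "A2", "material": "Eisen"},
    {"inv": "A3", "material": "Glas"},
]
written, unmapped = run_import(records, {"Holz": 1}, dry_run=True)
print(written, unmapped)
```

<p>A dry run over these three records writes nothing and reports both missing mappings at once, whereas a regular run would stop at the first one.</p>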



<h3 class="wp-block-heading">Dashboard</h3>



<p>The first page of the dashboard, which for almost all users also means the start page of musdb right after the login process, was significantly reworked during the last month. The almost entirely unused notetaking features and discourse integration were removed in favor of a feed of recent blog posts. See also the section <a href="https://blog.museum-digital.org/2025/12/29/trimming/">&#8220;Communications&#8221;</a> in the respective blog post.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1024x576.webp" alt="Screenshot of the dashboard in musdb, as of 2025-12-29." class="wp-image-4594" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1024x576.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-300x169.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1536x864.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb.webp 1920w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The dashboard in musdb now features a feed of recent news relevant to the development of museum-digital and whatever is going on regionally. The posts are sorted chronologically.</figcaption></figure>



<h3 class="wp-block-heading">Annotations for the Vocabulary Editing Team</h3>



<p>Each event, displayed as a tile on object editing pages, featured speech bubble icons behind each time / actor / place to provide additional comments and hints for the central vocabulary editing team. This positioning of the annotation feature led to confusion over the years, with some users using the feature to comment on the relationship between the entity and the object (for which the event notes should be used). We hence repositioned the links and moved them to the respective entity&#8217;s page (e.g. a place page for giving hints and comments on a place entry). The hinting / commenting feature for times has been altogether removed, as providing comments to clarify the meaning of e.g. a year never made much sense.</p>



<h3 class="wp-block-heading">Smaller Updates and Bugfixes</h3>



<ul class="wp-block-list">
<li>Fixed a bug in the HTML generated for listing other objects linked to an object. Links to the other objects were broken and now work again.</li>



<li>Image editing pages now embed the image directly instead of using the IIIF API. This reduces resource usage and increases stability at no cost.</li>



<li>Removed option to manually trigger the rewriting of EXIF and IPTC metadata of object images. Rewriting takes place in the background whenever an image or a linked object is updated, making user-triggered updates obsolete.</li>



<li>Re-introduced the option to repeat linking to the last used linked object.</li>



<li>Updated <a href="https://swagger.io/">Swagger UI</a> to version 5.30.3</li>
</ul>



<h2 class="wp-block-heading">Frontend</h2>



<p>As stated above and lengthily described in the previous blog posts (<a href="https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/">here</a>, <a href="https://blog.museum-digital.org/2025/12/22/cleaning-out-our-closet/">here</a>, and <a href="https://blog.museum-digital.org/2025/12/29/trimming/">here</a>) we struggled with stability over the last month. This means that most changes in the frontend are aimed at improving stability.</p>



<h3 class="wp-block-heading">Reworked Default Image Page</h3>



<p>Thoroughly described in <a href="https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/">Updates, AI scrapers, and Resilience</a>, we replaced the default view for single object image pages. While the default view was previously built around the IIIF viewer Mirador, the new default view uses OpenLayers and the unmediated image file for capabilities such as zooming. The new view also brings with it some new features, such as an option to reference specific sections of an image.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="672" src="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-page-1024x672.webp" alt="" class="wp-image-4615" srcset="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-page-1024x672.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-page-300x197.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-page-1536x1007.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-page-2048x1343.webp 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The reworked default image page.</figcaption></figure>



<h3 class="wp-block-heading">Serving Resource-Intensive Pages / Functionalities Only When Resources Are Available</h3>



<p>PDF generation, the IIIF Image API, and the suggestions for alternative search queries on failed search pages are now limited to reduce their impact on the overall system stability. This follows two strategies and one feature removal:</p>



<ul class="wp-block-list">
<li>Suggestions on failed search pages and PDF generation will only appear if the overall load on the system is low. The threshold that decides whether they are provided is influenced by the user&#8217;s browser language: If a user uses a browser set to the primary language of a given instance of museum-digital (e.g. German in Hesse, Hungarian in Budapest), the threshold is much higher, meaning these users will still be able to access the pages at a medium server load. In the case of PDFs, users are forwarded to the print dialogue for object pages at high server load instead of receiving a PDF generated on the server side.</li>



<li>PDF generation and the IIIF Image API are served with a different PHP configuration and set of processes than the rest of the frontend. This configuration significantly reduces available resources for these two functionalities.</li>



<li>The option to generate PDFs featuring all images of an object with between 10 and 40 images has been entirely removed. Given its constraints, the feature was hard to explain and rarely accessible anyway. The primary &#8220;users&#8221; were noticeably AI scrapers.</li>
</ul>
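<p>The load-dependent availability described in the first point above can be sketched like this. It is a Python illustration under stated assumptions: the function name, the two threshold values and the simplified parsing of the browser&#8217;s <code>Accept-Language</code> header are hypothetical and do not reflect the values or code used at museum-digital.</p>

```python
def feature_available(load, accept_language, instance_language,
                      base_threshold=1.0, native_threshold=4.0):
    """Decide whether a resource-intensive feature is offered right now.

    load: current system load (e.g. a load average)
    accept_language: the browser's Accept-Language header
    instance_language: primary language of the instance, e.g. "de"
    The threshold values are placeholders, not real configuration.
    """
    # Only look at the first (preferred) language, ignoring region and weight.
    first = accept_language.split(",")[0].split(";")[0]
    primary = first.split("-")[0].strip().lower()
    if primary == instance_language:
        threshold = native_threshold
    else:
        threshold = base_threshold
    return threshold > load


print(feature_available(2.5, "de-DE,de;q=0.9,en;q=0.8", "de"))
print(feature_available(2.5, "en-US,en;q=0.9", "de"))
```

<p>At a medium load, the first call (German browser on a German-language instance) is still served, while the second is not. At low load both would be served, and at high load neither.</p>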



<h3 class="wp-block-heading">Image Search</h3>



<p>The image search feature was refactored and reduced to further separate it from the primary object search. The number of available search options has been reduced to be more easily explainable and reduce possibilities for very resource-intensive queries.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="602" src="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-search-1024x602.webp" alt="" class="wp-image-4614" srcset="https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-search-1024x602.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-search-300x176.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-search-1536x903.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_frontend-image-search-2048x1204.webp 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The reworked image search settings overlay.</figcaption></figure>



<h3 class="wp-block-heading">Batch Export of Object Metadata / OAI</h3>



<p>Updated the LIDO API to almost entirely match the LIDO as generated during exports from musdb.</p>



<h3 class="wp-block-heading">Smaller Updates and Bugfixes</h3>



<ul class="wp-block-list">
<li>Improved performance of object search by tags and places by filtering searched entities to those that are actually linked in the given instance of museum-digital.</li>



<li>Object groups with only one object are no longer displayed or linked on object pages.</li>



<li>Fixed link in footer: Clicking on &#8220;museum-digital&#8221; now leads to the home / start page of the given instance of musdb.</li>



<li>Updated <a href="https://swagger.io/">Swagger UI</a> to version 5.30.3</li>
</ul>



<h2 class="wp-block-heading">nodac</h2>



<ul class="wp-block-list">
<li>User-provided comments / hints have been removed for times (see above)</li>



<li>Tooltips for linked objects now display which institution an object belongs to
<ul class="wp-block-list">
<li>This is particularly important for vocabulary editors who do not have access to the museums&#8217; data. This way they get a limited preview with the required information for unpublished objects despite their otherwise lacking permissions.</li>
</ul>
</li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2026/01/12/state-of-development-december-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2026/01/20260112_News-Img-1.webp</url><width>600</width><height>467</height></post-thumbnail>	</item>
		<item>
		<title>Trimming.</title>
		<link>https://blog.museum-digital.org/2025/12/29/trimming/</link>
					<comments>https://blog.museum-digital.org/2025/12/29/trimming/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 29 Dec 2025 01:10:16 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Feature Removal]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[System administration]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4592</guid>

					<description><![CDATA[The recent issues with server instability have been solved. To do so, we had to significantly reduce resources available to the IIIF API. And in learning from the whole situation, a feed of the most recent relevant blog posts is now displayed to users directly in musdb.]]></description>
										<content:encoded><![CDATA[
<p>In the last weeks we struggled with server stability. As <a href="https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/">written before</a>, the critical, resource-heavy and publicly available tasks have for a long time been the generation of timelines (and thus complicated search queries) and on the other hand those involving the processing or generation of large files; namely the <a href="https://iiif.io/">IIIF</a> API and PDF generation.</p>



<p>In the <a href="https://blog.museum-digital.org/2025/12/22/cleaning-out-our-closet/">last post</a>, I detailed how we severely restricted the availability of the public PDF generation functionalities in museum-digital according to available system resources. That, as it turned out, was not enough to bring reliable stability to our systems. After the server fell over on December 26th once more, we hence moved the IIIF Image API into the same PHP setup used for PDF generation &#8211; meaning that any user/IP can only request the API 10 times a minute and that for any instance of museum-digital, only one PHP worker serves it. This allowed us to severely reduce the maximum available resources per worker for the frontend outside of those two use cases (where the IIIF Image API may use up to 80 MB of RAM, no other part of the frontend will go beyond 5). Since then, the system runs as smoothly as if AI scraping had never become an issue.</p>
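<p>To make the limit of 10 requests per minute more tangible, here is a minimal sliding-window limiter in Python. It is only a sketch: how the limit is actually enforced at museum-digital (in the web server, in the PHP setup or elsewhere) is not described here, and the class name and the in-memory bookkeeping are assumptions.</p>

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit=10, window=60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # client -> timestamps of accepted requests

    def allow(self, client, now):
        hits = self._hits[client]
        # Forget requests that have left the time window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
# Eleven requests from the same (documentation-range) IP, one per second.
results = [limiter.allow("203.0.113.7", float(second)) for second in range(11)]
print(results.count(True), results.count(False))
```

<p>The eleventh request within the same minute is rejected, while requests from other clients are unaffected. A viewer that requests many image segments in quick succession runs into such a limit almost immediately.</p>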



<h2 class="wp-block-heading">A Limited Goodbye to IIIF &amp; Server-Side Image Manipulation</h2>



<p>Now, what does that mean in practice? On the one hand, we have not fully removed the IIIF image API. All links generated using it remain valid and will be served, even if comparatively slowly.</p>



<p>On the other hand the user experience with viewing the images in a IIIF viewer will be significantly worse, even though this strongly depends on the IIIF viewer. The most popular &#8220;full&#8221; IIIF viewers being <a href="https://projectmirador.org/">Mirador</a> and <a href="https://universalviewer.io/">Universal Viewer</a>, significant problems (or a complete inability to use an object&#8217;s images) are to be expected with Mirador. Mirador in its default configuration loads multiple segments of an image separately to then assemble the displayed image from those &#8211; with the creation of the segments happening on the server, thus consuming resources centrally. It also seems to set extremely low limits on accepted response times, which museum-digital&#8217;s IIIF Image API now regularly exceeds due to the aggressive rate limiting. Simply looking at the demo installation of Universal Viewer, the software seems to be much more targeted in its API calls and might still work well despite the restrictions.</p>
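<p>Why a tiling viewer collides with the rate limit can be illustrated with a rough Python sketch. The URL pattern follows the IIIF Image API 2 syntax (<code>{identifier}/{region}/{size}/{rotation}/{quality}.{format}</code>). The base URL, the identifier, the tile size of 512 pixels and the restriction to full-resolution tiles are assumptions for illustration. Mirador&#8217;s real request pattern is more involved and depends on its configuration.</p>

```python
def tile_urls(base, identifier, width, height, tile=512):
    """List IIIF Image API 2 URLs for all full-resolution tiles of an image."""
    urls = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Tiles at the right and bottom edge may be smaller.
            w = min(tile, width - x)
            h = min(tile, height - y)
            region = f"{x},{y},{w},{h}"
            urls.append(f"{base}/{identifier}/{region}/{w},/0/default.jpg")
    return urls


# Hypothetical server and image; 2048 x 1536 pixels.
urls = tile_urls("https://example.org/iiif", "img1", 2048, 1536)
print(len(urls))
print(urls[0])
```

<p>Even this moderately sized image results in twelve separate requests for a single view, each of which requires the server to cut out a segment. That alone exceeds a limit of 10 requests per minute.</p>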



<p>As far as I know, there are no published numbers on the market share of the different IIIF image viewers, nor on whether IIIF viewers external to whoever provides the API are actually regularly used. The most jaded &#8211; and likely true &#8211; assumption would be that the share of users who use IIIF without a viewer hosted next to the API is minuscule and that most users will use one of the two viewers mentioned above. Our experience, once again, seems to support that hypothesis: We released our implementation of IIIF 2 in 2020, but essentially nobody noticed before we also started hosting a IIIF viewer.</p>



<p>As we do use Mirador as a viewer, assume the &#8220;visible&#8221; IIIF image API at museum-digital to be more or less broken. Developers and those making direct use of the API without our installation of Mirador can still benefit from the API. But those are comparatively few.</p>



<p>The radical restriction of resources provided to the IIIF Image API is thus likely indeed a goodbye to IIIF, if a limited one. The basic idea is great &#8211; to create a unified way to reference parts of an image (or later a wider media file) and annotate it. In times of significantly increased bot activity, reduced funds, and foreseeably rising hosting costs, our example may be an early sign that the decision to realize that aim by specifying an API to be implemented by the data providers restricts the ability to fully support IIIF to very well-resourced institutions. And as funds are shrinking, that is fewer and fewer institutions. Let&#8217;s hope that the most basic need IIIF wished to fulfill can be achieved in a different way in the future; one that is accessible to anybody. Realistically this means that computing would need to happen on the client PCs, not on a server.</p>



<p>To end the saga on a more positive note: Since we limited the IIIF Image API, our systems run wonderfully smoothly again and we were able to reduce the overall rate limiting on the rest of museum-digital&#8217;s portals. We will monitor the situation and increase the limit slowly to allow more simultaneous API requests without risking stability.</p>



<h2 class="wp-block-heading">Communications</h2>



<p>Second, the whole ordeal posed a challenge to our communication channels. If any significant error occurs anywhere on museum-digital, I personally am sent an encrypted error message via mail. Usually. In this case, the primary component falling over was the PHP server, which is also responsible for managing the sending of mails. If a service fell over, the primary way to learn of it was receiving mails about that instead. Reaction times were thus worse than they needed to be. This means that we need to improve our monitoring.</p>



<p>On the other hand there was the issue of explaining what was going on. We had a thread about it in the <a href="https://forum.museum-digital.info/d/69-uploading-images-in-musdb-are-slow-and-buggy">forum</a>, which few people read. We had the blog posts. Which few people read. We lack (or lacked) a unified source of information about current events that we can assume people to read. The blog could and should be exactly that.</p>



<p>At the top right of the login screen of musdb, the two most recent blog posts from the respective region as well as from the &#8220;development&#8221; category of the blog have been shown for years. Then we turned on the &#8220;remember me&#8221; feature by default, which means that people only very rarely see the login page at all anymore.</p>



<p>The first page most users see upon logging in or opening musdb while logged in is the dashboard, the default subsection of which previously offered a summary of the database contents a user has access to, a tile for writing personal notes to oneself, a tile with messages from the respective regional administrators, a tile for the integration of a <a href="https://www.discourse.org">discourse</a> forum, and links to the museum elsewhere on the web.</p>



<p>The summary of database contents and the links to the museum elsewhere are certainly useful. The other features not so much. Checking their actual use revealed that barely anybody used any of the note-taking features (likely also because musdb itself offers better alternatives elsewhere), while the discourse integration has not been in use for years. The very first features one sees when opening musdb were thus largely unused, wasting space that could be filled with a feed of relevant blog entries.</p>



<p>And so we removed the unused features and replaced them with a more prettily designed feed. This feed now contains the two newest blog posts from the development feed in the user&#8217;s language, the regional or national feed (again in the user&#8217;s language) as well as &#8211; importantly &#8211; the English-language development feed. None of the most recent development-related posts were translated to any language other than the original English, mainly because the time was better spent trying to alleviate or fix the issues than describing them in yet another language. Besides, most people know enough English to grasp the posts. And for those who do not: Community contributions to the blog, including translations, are always welcome.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1024x576.webp" alt="Screenshot of the dashboard in musdb, as of 2025-12-29." class="wp-image-4594" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1024x576.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-300x169.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb-1536x864.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2025/12/20251229_screenshot-musdb.webp 1920w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The dashboard in musdb now features a feed of recent news relevant to the development of museum-digital and whatever is going on regionally. The posts are sorted chronologically.</figcaption></figure>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/12/29/trimming/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/12/20251222-blog-post-alte-zoepfe-scaled.webp</url><width>600</width><height>411</height></post-thumbnail>	</item>
		<item>
		<title>Updates, AI scrapers, and Resilience</title>
		<link>https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/</link>
					<comments>https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Tue, 09 Dec 2025 00:11:30 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[System administration]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4580</guid>

					<description><![CDATA[Between Thursday last week (November 27th) and yesterday (December 6th), museum-digital has seen its most unstable week in about four years. Now that the dust has settled a bit, there's finally some time to discuss what happened and how we managed to tackle the multiple issues leading to the (very noticeable) instability.]]></description>
										<content:encoded><![CDATA[
<p>Between Thursday last week (November 27th) and yesterday (December 6th), museum-digital has seen its most unstable week in about four years. Now that the dust has settled a bit, there&#8217;s finally some time to discuss what happened and how we managed to tackle the multiple issues leading to the (very noticeable) instability.</p>



<h2 class="wp-block-heading">Background</h2>



<h3 class="wp-block-heading">Scrapers</h3>



<p>There were (or are) two factors simultaneously pushing our servers to their limits and requiring changes. On the one hand, scraping of museum-digital has gotten even more aggressive. Where we usually had around 10-30 requests per second across all of museum-digital a year ago, we had around 300 two weeks ago. Right now it&#8217;s often between 500 and 700. This number excludes any access to static files.</p>



<p>As I&#8217;ve written elsewhere, the scrapers are mostly noticeable by coming from IP ranges in Asia or (to a lesser extent) the US. On the other hand, the IPs change constantly and user-agents etc. resemble those of regular users. Likely they simply use an actual Chrome browser for scraping. Which is to say, attempting to block them is futile. Worse yet, attempts to block scrapers would likely also impact some real users.</p>



<p>Fortunately, museum-digital is run on dedicated servers paid by time rather than by compute. The onslaught of scrapers thus has no financial impact on us. But the scrapers still use resources, and as they try to scrape as many different pages as possible, it is much harder to optimize for them than it is to optimize for actual human users (see this article on a similar issue at <a href="https://arstechnica.com/information-technology/2025/04/ai-bots-strain-wikimedia-as-bandwidth-surges-50/">Wikimedia</a>).</p>



<p>Either way, AI scrapers can result in improvements. Viewed positively, they essentially act as a free stress test on a service and enforce efficiency in all aspects. If most pages are optimized for performance already, scrapers will find the unoptimized ones and bring down a service by overusing those. Which is to say, they help to identify yet unoptimized scripts/pages/classes and enforce that necessary changes are made. At museum-digital, there are three main weak spots that are hard to optimize: timelines, image manipulation (including the IIIF API), and PDF generation.</p>



<h3 class="wp-block-heading">PHP</h3>



<p>On November 20th PHP 8.5 was released. Thus far, museum-digital had been running on PHP 8.3 for web hosting and PHP 8.4 on the command line. When we attempted to update to 8.4 last year, the server fell over. This was mainly caused by the IIIF API (and thus, image manipulation via <a href="https://www.libvips.org/">libvips</a>).</p>



<p>Dependencies at museum-digital are (as is pretty much universal with PHP) handled using the package manager <code>composer</code>. When we set up a new instance of museum-digital, composer (managed on version 8.4) required PHP 8.4 or later to run &#8211; the new instance was thus unable to run, being stuck on version 8.3 for hosting.</p>



<p>That leaves two options: Either to set up composer using PHP 8.3 again, or to simply update everything to the current version. While PHP 8.3 will be <a href="https://www.php.net/supported-versions.php">supported until 2027</a>, it is generally advisable to update when possible. So updating it was.</p>



<p>Importantly, PHP at museum-digital is run via <a href="https://www.php.net/manual/de/install.fpm.php">PHP-FPM</a>. Before the update, we had one socket running per subdomain. This meant that if a PHP process serving the frontend stopped working for any reason, users in musdb were impacted as well.</p>



<h2 class="wp-block-heading">Upgrading PHP to version 8.5</h2>



<p>Once we upgraded the PHP version to 8.5 on Thursday, the same problems we faced with PHP 8.4 appeared again. The server would run rather smoothly for some hours, then more and more PHP processes would die and PHP-FPM would fall over for a given subdomain, and users would get a 504 gateway timeout error. Again, the IIIF API and image manipulation were the main causes of PHP-FPM getting stuck. Of course, the number of AI scrapers continuing to use the site did not help.</p>



<h3 class="wp-block-heading">PHP-FPM settings</h3>



<p>A natural first point to consider was the configuration of PHP-FPM. PHP-FPM knows three basic modes for running an application:</p>



<ul class="wp-block-list">
<li><code>ondemand</code> You define a maximum number of processes the application may use. When a new request is made, idle processes get used. If there is no idle process, PHP-FPM starts a new one. After a specified number of requests or a given number of seconds spent idle, an old process is closed. This is primarily aimed at being able to scale way down &#8211; if there are no requests, there will be no processes (which is to say, fewer resources used). On the other hand, starting new processes takes time.</li>



<li><code>static</code> You define a number of processes that should always be running for the application. This means that there should always be processes already started and ready for usage, but it also means that those processes take up resources even when they are little used. Which is to say, this is useful if one has a high and constant stream of users.</li>



<li><code>dynamic</code> You define a maximum number of processes, as well as how many processes should always be running for immediate use, and a (minimum and maximum) number of spare processes to keep running. PHP-FPM then decides whether more processes should be started or whether one of the already running ones should be used. This, in theory, is useful if one wants to reliably and quickly serve users, expects some use all the time, but wants the server to dynamically scale up and down as needed.</li>
</ul>



<p>With museum-digital spread out over around 80 subdomains, we had thus far used the <code>ondemand</code> mode for most subdomains. Only the largest and most used instances / subdomains of museum-digital were run using <code>dynamic</code> mode. With the update to PHP 8.4 and then 8.5, the behavior of the <code>ondemand</code> mode seems to have changed. If one process dies, the whole subdomain seems to go down with it (I have not found any documentation on this, but it&#8217;s evident from the last two weeks).</p>



<p>We hence moved critical subdomains impacted by the errors (which is to say, any &#8220;regular&#8221; instance of museum-digital) to dynamic mode. As dynamic mode enforces stricter limits on how many processes can be run relative to the available hardware (which is to say, dynamic mode requires a better-written configuration), this also meant that we needed to adjust the specified numbers of processes per subdomain according to their use.</p>



<p>To actually grasp <em>real</em> use of a subdomain including bots, we turned to the logs we keep for about a week (and then rotate out). In server logs, usually one line corresponds to a single request. With a small script, we loop over all the different subdomains and check how many requests were made. To be really sure that only requests to relevant PHP scripts are processed, we filter them by the presence of the substring &#8220;php&#8221; before counting. The result for today between 1 a.m. and 4 p.m. looks as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="raw" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">| Requests count in instance                         |      Total |      musdb |        PDF | 
| -----                                              |      ----- |      ----- |      ----- | 
| agrargeschichte.museum-digital.de                  |     341508 |       1245 |        719 | 
| bawue.museum-digital.de                            |     454228 |      12559 |       6819 | 
| bayern.museum-digital.de                           |     176291 |          0 |        158 | 
| berlin.museum-digital.de                           |     223280 |      14917 |       6814 | 
| brandenburg.museum-digital.de                      |      63286 |       6927 |       3873 | 
| bremen.museum-digital.de                           |     221208 |          0 |       2026 | 
| bund.museum-digital.de                             |        261 |        167 |          5 | 
| collectors.museum-digital.de                       |     108398 |        449 |        648 | 
| hamburg.museum-digital.de                          |      35489 |          0 |         11 | 
| hessen.museum-digital.de                           |      50932 |       7962 |       2486 | 
| meckpomm.museum-digital.de                         |      94177 |         11 |        139 | 
| nds.museum-digital.de                              |     137703 |       4105 |       4134 | 
| owl.museum-digital.de                              |     427667 |       1258 |       2412 | 
| rheinland.museum-digital.de                        |      64838 |       1753 |       1276 | 
| rlp.museum-digital.de                              |     207944 |       7405 |       7532 | 
| sachsen.museum-digital.de                          |     120931 |      16117 |       6034 | 
| saarland.museum-digital.de                         |        210 |          0 |          1 | 
| smb.museum-digital.de                              |     228542 |          0 |      11517 | 
| sh.museum-digital.de                               |      21098 |          0 |         48 | 
| st.museum-digital.de                               |     317913 |       6243 |       6217 | 
| thue.museum-digital.de                             |     117893 |          0 |        495 | 
| westfalen.museum-digital.de                        |     101584 |       2033 |       3310 | 
| br.museum-digital.org                              |      43413 |          0 |         16 | 
| jateng.id.museum-digital.org                       |        211 |          0 |          0 | 
| jatim.id.museum-digital.org                        |      23410 |          0 |        159 | 
| lazio.it.museum-digital.org                        |        295 |          0 |          0 | 
| ma.pl.museum-digital.org                           |        385 |          0 |          0 | 
| noe.at.museum-digital.org                          |     906386 |          0 |        369 | 
| tirol.at.museum-digital.org                        |        537 |          0 |          7 | 
| vbg.at.museum-digital.org                          |         96 |          0 |          0 | 
| wien.at.museum-digital.org                         |     472305 |        586 |       3243 | 
| ulster.ie.museum-digital.org                       |      28869 |          0 |          2 | 
| connacht.ie.museum-digital.org                     |        392 |          0 |          0 | 
| va.srb.museum-digital.org                          |       5599 |          0 |         22 | 
| ko.rou.museum-digital.org                          |       9036 |        635 |        567 | 
| mm.rou.museum-digital.org                          |        235 |          0 |          0 | 
| ca.usa.museum-digital.org                          |       3946 |          0 |          0 | 
| ma.usa.museum-digital.org                          |        357 |          0 |          0 | 
| ny.usa.museum-digital.org                          |      19576 |          0 |        294 | 
| syddanmark.dk.museum-digital.org                   |        675 |          0 |          9 | 
| de.pt.museum-digital.org                           |       1241 |          0 |         29 | 
| zh.ch.museum-digital.org                           |     233280 |        512 |        650 | 
| ba.hu.museum-digital.org                           |      99927 |       1901 |         72 | 
| be.hu.museum-digital.org                           |     100830 |        244 |       3005 | 
| bk.hu.museum-digital.org                           |     489446 |         55 |       3985 | 
| bu.hu.museum-digital.org                           |     213616 |       6206 |       5753 | 
| bz.hu.museum-digital.org                           |     598550 |        680 |       1788 | 
| cs.hu.museum-digital.org                           |      88585 |          0 |       1054 | 
| fe.hu.museum-digital.org                           |     199812 |          7 |        215 | 
| gs.hu.museum-digital.org                           |     216680 |       4215 |        912 | 
| hb.hu.museum-digital.org                           |      61250 |          0 |         65 | 
| he.hu.museum-digital.org                           |      26312 |          7 |         26 | 
| jn.hu.museum-digital.org                           |      11970 |          0 |        131 | 
| ke.hu.museum-digital.org                           |     370219 |       2959 |       1680 | 
| no.hu.museum-digital.org                           |     119487 |          0 |       1545 | 
| pe.hu.museum-digital.org                           |     603846 |       2957 |       1446 | 
| so.hu.museum-digital.org                           |     308116 |       6151 |       6698 | 
| sz.hu.museum-digital.org                           |        116 |          0 |          0 | 
| to.hu.museum-digital.org                           |      52406 |          0 |       1229 | 
| va.hu.museum-digital.org                           |     184231 |       2839 |       1666 | 
| ve.hu.museum-digital.org                           |    1015509 |       3672 |        296 | 
| za.hu.museum-digital.org                           |        199 |          0 |          6 | 
| ce.cz.museum-digital.org                           |          3 |          0 |          0 | 
| ccc.cz.museum-digital.org                          |         17 |          0 |          0 | 
| academia.hu.museum-digital.org                     |       9158 |          0 |         13 | 
| cherkasy.ua.museum-digital.org                     |      25567 |          0 |         26 | 
| chernihiv.ua.museum-digital.org                    |       3258 |         99 |        156 | 
| dnipro.ua.museum-digital.org                       |      26725 |          0 |        109 | 
| donetsk.ua.museum-digital.org                      |         17 |          0 |          0 | 
| ivfr.ua.museum-digital.org                         |        722 |          0 |          9 | 
| kharkiv.ua.museum-digital.org                      |      12932 |          0 |         39 | 
| kyiv.ua.museum-digital.org                         |     436482 |       5967 |       1351 | 
| kyivska.ua.museum-digital.org                      |       2159 |          0 |         79 | 
| lviv.ua.museum-digital.org                         |     163358 |        188 |        274 | 
| poltava.ua.museum-digital.org                      |       7657 |        284 |          3 | 
| odesa.ua.museum-digital.org                        |         93 |          0 |          1 | 
| rivne.ua.museum-digital.org                        |      59510 |         65 |        156 | 
| sumy.ua.museum-digital.org                         |      35890 |        303 |          3 | 
| ternopil.ua.museum-digital.org                     |     150700 |         37 |        184 | 
| zhytomyr.ua.museum-digital.org                     |          3 |          0 |          0 | 
| vinnytsia.ua.museum-digital.org                    |      14229 |          0 |          0 | 
| volyn.ua.museum-digital.org                        |      16705 |          0 |        485 | 
| zakarpattia.ua.museum-digital.org                  |       2865 |          0 |         30 | 
| zaporizhzhia.ua.museum-digital.org                 |      24348 |        338 |         56 | 
| scotland.museum-digital.org                        |          0 |          0 |          0 | 
| md.museum-digital.org                              |          0 |          0 |          0 | 
| demo.museum-digital.org                            |         12 |          2 |          0 | 
| goethehaus.museum-digital.de                       |     260072 |          0 |         85 | 
| lmw.museum-digital.de                              |     326724 |          0 |         65 | 
| gedenkstaetten.museum-digital.de                   |       3474 |          0 |          0 | 
| turcica.museum-digital.de                          |      75533 |          0 |          1 | 
| nat.museum-digital.de                              |    1238860 |          0 |       4657 | 
| at.museum-digital.org                              |     631578 |          0 |         89 | 
| cz.museum-digital.org                              |          2 |          0 |          0 | 
| dk.museum-digital.org                              |       5415 |          0 |          4 | 
| hu.museum-digital.org                              |     359619 |          0 |       2827 | 
| id.museum-digital.org                              |       8030 |          0 |          0 | 
| ie.museum-digital.org                              |       2073 |          0 |          0 | 
| it.museum-digital.org                              |         78 |          0 |          0 | 
| rou.museum-digital.org                             |       8277 |          0 |        466 | 
| pl.museum-digital.org                              |        142 |          0 |          0 | 
| pt.museum-digital.org                              |          0 |          0 |          0 | 
| srb.museum-digital.org                             |        565 |          0 |          0 | 
| ua.museum-digital.org                              |     232115 |          0 |        805 | 
| usa.museum-digital.org                             |       3752 |          0 |         34 | 
| ch.museum-digital.org                              |      53417 |          0 |          1 | 
| global.museum-digital.org                          |     727690 |          0 |       2199 |</pre>



<p>Note that the number of requests obviously is also impacted by bots changing attention &#8211; once a scraper is done with one subdomain, it turns to the next. An elevated number of requests in ve.hu.museum-digital.org is normal, but today&#8217;s number is still starkly exaggerated when compared to other days. The Germany-wide instance is persistently the most frequented one; the global one is usually second, at around 80% of requests.</p>



<p>Now equipped with actual numbers, we could adjust PHP-FPM to a much more suitable configuration than before (we had thus far never bothered counting actual requests, instead relying on the number of objects).</p>
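


<p>The counting described above can be sketched roughly as follows. This is a minimal illustration in Python, not the script actually used. It assumes one access log file per subdomain whose name ends in <code>.access.log</code>, and it assumes that musdb and PDF requests can be recognized by the hypothetical markers <code>/musdb/</code> and <code>pdf</code> in the logged request. All function names are made up for this example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from collections import Counter
from pathlib import Path


def count_php_requests(lines):
    """Count log lines that contain "php", split into rough categories.

    The "/musdb/" and "pdf" markers are hypothetical; adjust them to
    whatever distinguishes the applications in your own URLs.
    """
    counts = Counter(total=0, musdb=0, pdf=0)
    for line in lines:
        if "php" not in line:
            continue  # static files and other non-PHP requests are skipped
        counts["total"] += 1
        if "/musdb/" in line:
            counts["musdb"] += 1
        if "pdf" in line.lower():
            counts["pdf"] += 1
    return counts


def format_row(name, counts):
    """Format one table row in a layout similar to the listing above."""
    return "| {:50} | {:10d} | {:10d} | {:10d} |".format(
        name, counts["total"], counts["musdb"], counts["pdf"])


def count_all(log_dir):
    """Loop over every access log in log_dir (assumed: one file per subdomain)."""
    rows = []
    for path in sorted(Path(log_dir).glob("*.access.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            name = path.name.removesuffix(".access.log")
            rows.append(format_row(name, count_php_requests(handle)))
    return rows</pre>



<p>Calling <code>count_all()</code> with the log directory returns one formatted row per subdomain. Restricting the count to a time window, as done for the table above, would additionally require parsing the timestamp of each line, which depends on the log format in use.</p>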



<p>A second step in the PHP-FPM configuration was to reduce the impact the problems had. Previously there was one shared configuration and socket per subdomain. On the one hand, this meant that stuck processes in the frontend impacted users in musdb (and vice-versa). On the other hand, some constraints on resource usage cannot be set on a per-directory level but must be set per PHP-FPM socket / server (see the PHP documentation on <a href="https://www.php.net/manual/en/configuration.file.per-user.php">.user.ini</a> and the list of <a href="https://www.php.net/manual/en/ini.list.php">php.ini directives</a>). As the frontend and musdb have different requirements (frontend: low maximum memory use, short timeouts, no file uploads, generally strict settings; musdb: long timeouts for uploads, generally more lenient), being able to configure them independently of each other is useful in general.</p>



<p>We thus separated the configuration for the frontend, musdb, and PDF generation in the frontend, providing dedicated sockets for each. The frontend has a reduced <a href="https://de.wikipedia.org/wiki/Nice_(Unix)">priority</a> on the system overall, strict constraints on how it may be used, etc. The settings are stricter than they were before. musdb has an elevated priority and more lenient settings (file uploads, longer timeouts), in fact more lenient than before. Finally, PDF generation is a special case as it offers no real benefit over the browser&#8217;s print tool (see MDN on <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Guides/Media_queries/Printing">print CSS</a>), while being resource-intensive. As such, it has a far reduced priority and very strict settings.</p>



<p>With the separated configuration and sockets, we can now better tailor the configuration to each application&#8217;s needs and have the added benefit of problems in one application not impacting the other.</p>
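


<p>As an illustration of such a split, a set of PHP-FPM pools for one subdomain might look like the fragment below. It is a sketch under assumptions, not our actual configuration: the pool names, socket paths, user, and all numbers are invented for the example. Only the directive names are PHP-FPM&#8217;s own.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="ini" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">; Hypothetical pool for the public frontend: strict limits, lowered priority
[example-frontend]
user = www-data
group = www-data
listen = /run/php/example-frontend.sock
pm = dynamic
pm.max_children = 40
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 12
pm.max_requests = 500
process.priority = 5
request_terminate_timeout = 15s
php_admin_value[memory_limit] = 128M
php_admin_flag[file_uploads] = off

; Hypothetical pool for musdb: raised priority, room for uploads
[example-musdb]
user = www-data
group = www-data
listen = /run/php/example-musdb.sock
pm = dynamic
pm.max_children = 20
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6
process.priority = -5
request_terminate_timeout = 300s
php_admin_value[memory_limit] = 512M
php_admin_value[upload_max_filesize] = 64M
php_admin_value[post_max_size] = 64M

; Hypothetical pool for PDF generation: very few workers, lowest priority
[example-pdf]
user = www-data
group = www-data
listen = /run/php/example-pdf.sock
pm = dynamic
pm.max_children = 4
pm.start_servers = 1
pm.min_spare_servers = 1
pm.max_spare_servers = 2
process.priority = 15
request_terminate_timeout = 20s
php_admin_value[memory_limit] = 128M</pre>



<p>The web server then has to hand requests for each application to the matching socket. Note that <code>process.priority</code> follows the nice scale, so lower values mean higher priority, and negative values only take effect if the PHP-FPM master process is started as root.</p>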



<h3 class="wp-block-heading">Code</h3>



<p>As we had already prepared the codebase for PHP 8.4 awaiting an eventual upgrade, the upgrade to PHP 8.5 only required minimal changes. Aside from the deprecation of the functions <code>finfo_close()</code> and <code>curl_close()</code>, references to which were accordingly removed from the code, the update necessitated no further work.</p>



<h2 class="wp-block-heading">Scaling in Software</h2>



<p>Improving the PHP configuration was not enough to fix the issues, especially with the now increased number of requests from bots. To get some breathing room, we adjusted the most resource-intensive pages.</p>



<h3 class="wp-block-heading">Frontend</h3>



<p>In the frontend these are, again, the IIIF API, PDF generation, and timelines. In addition, we made changes to the pages for failed searches to better handle high load situations.</p>



<h4 class="wp-block-heading">Image pages</h4>



<p>The IIIF API was used for the main image pages in the frontend. We used (and use) <a href="https://projectmirador.org/">Mirador</a> as a IIIF viewer. Simply opening an image page thus meant three requests to fetch different regions of an image. Zooming into the image triggered further requests to fetch the relevant parts of the image. Cropping the image to the requested region with IIIF happens on the server (which is no problem if there are few users, but turns into a problem once there are hundreds of requests per second).</p>



<p>We thus changed the default of image pages: The new default image page is the old, non-IIIF one. Features that Mirador comes with, like zooming into images, are popular and useful, and the old image page did not support them. We thus worked to improve the page. To do so, we rely on <a href="https://openlayers.org/">OpenLayers</a>, a library we already use for maps. Besides including maps from tile servers, OpenLayers also supports loading simple image files &#8211; which we do here. The image is hence loaded once in full size and zooming etc. happen entirely in the browser.</p>



<p>Taking the opportunity, we improved the page overall. An often noticed problem of image pages thus far was that users who opened image pages coming from external services (think Google Images) had problems identifying that the image was an object image and that there is further object data to be found on object pages. The updated image pages now come with a header stating the name of the image, the name of the object and the name of the institution. Note that many images do not feature a dedicated title. In that case, musdb uses the object name as a default image title, which is why the object title will often appear twice in the header. Maybe this can serve as an encouragement for the colleagues working in musdb to set expressive image titles more consistently in the future.</p>



<p>Also new is a mini map at the bottom left, displaying where in the wider context of the image one has currently zoomed in, as well as the ability to link to exactly the region one has currently zoomed into. To enable the latter, the URL updates as one zooms or navigates around the image. Somebody else opening the same URL will then open exactly the same image region the linking person was viewing when copying the URL. Finally, we set specific <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CSP">Content Security Policies</a> relevant to the currently opened media. If the displayed media entry is an internally stored image, no external images need to be allowed to load. If the displayed media entry is an audio file stored on archive.org, archive.org needs to be whitelisted as a source for audio files &#8211; but only archive.org and no other page. Previously, embedding images from anywhere on the net was allowed, which increased the damage a potential attacker could cause.</p>
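


<p>The idea of a media-specific Content Security Policy can be sketched as below. This is a simplified Python illustration, not the code used in the frontend: the function name, the media type labels and the choice of <code>default-src 'self'</code> as a baseline are assumptions made for the example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from urllib.parse import urlsplit


def build_csp(media_type, media_url, own_host):
    """Return a Content-Security-Policy value tailored to one media entry.

    media_type is "image", "audio" or "video" (hypothetical labels);
    media_url is where the file is stored. Only the single external
    origin that is actually needed gets allowed.
    """
    directive = {"image": "img-src",
                 "audio": "media-src",
                 "video": "media-src"}[media_type]
    sources = {"img-src": ["'self'"], "media-src": ["'self'"]}
    parts = urlsplit(media_url)
    if parts.hostname and parts.hostname != own_host:
        # Externally hosted file: allow exactly this origin, nothing else
        sources[directive].append(f"{parts.scheme}://{parts.hostname}")
    policy = ["default-src 'self'"]
    policy += [f"{name} {' '.join(values)}" for name, values in sources.items()]
    return "; ".join(policy)</pre>



<p>For an audio file stored on archive.org, the result allows archive.org in <code>media-src</code> only, while images stay restricted to the site itself. The returned string would be sent as the value of the <code>Content-Security-Policy</code> response header.</p>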



<p>Making the use of Mirador a secondary, non-default option reduced the need for server-side image manipulation and the corresponding resource use significantly. The IIIF API remains largely unchanged, but its use must now be requested explicitly.</p>



<h4 class="wp-block-heading">PDF generation</h4>



<p>As stated above, PDF generation offers few advantages over the browser&#8217;s print functionality in combination with object pages. On the contrary, the PDFs generated using the frontend&#8217;s templates feature less information. But they come with the file ending &#8220;.pdf&#8221; and seem to be extremely popular with bots. At the same time, PDF generation means, among other things, loading whatever images are to be embedded into the PDF and manipulating them to fit into the PDF. The resulting files are significantly larger than the corresponding HTML files and thus also use more of the available bandwidth.</p>



<p>The update to handle PDF generation with respect to resource usage was already introduced over the last months: if a user has set their <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept-Language">browser language</a> to any language different from a museum-digital instance&#8217;s default language, publicly linked PDFs are now only generated if overall load on the server is low. As most scrapers do not bother to change their browser language (which means they come with either none, English or Chinese), this means they will mostly be unable to trigger the generation of PDFs. They see an error page instead.</p>



<h4 class="wp-block-heading">Failed Search Pages</h4>



<p>If a user tries to execute a search query without any results, they will get suggestions for similar search terms &#8211; similar to how Google will ask somebody searching for &#8220;Berrlin&#8221; whether they meant &#8220;Berlin&#8221;. Trying to identify suitable suggestions obviously costs resources and whether the suggestions are actually what a user wanted is by nature hit or miss &#8211; they are suggestions after all. In the case of scrapers, suggesting alternative search queries offers them a never-ending stream of possible search queries to run and keep scraping the subdomain with &#8211; to nobody&#8217;s benefit (not even the scrapers&#8217;, as they likely got the same content with other search queries already).</p>



<p>We thus now use the same function used to identify whether PDFs should be generated for a user to check if search suggestions should be provided. If a user comes with a non-default browser language and resource use is high, no suggestions will be provided.</p>
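


<p>A check of this kind, shared between PDF generation and search suggestions, might look like the following Python sketch. It is written under assumptions: the function names, the load threshold of 0.7 per CPU core, and the simplification of only looking at the first language in the <code>Accept-Language</code> header are all choices made for this example, not a description of the actual implementation.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os


def primary_language(accept_language):
    """Return the first language tag of an Accept-Language header, lowercased.

    "de-DE,de;q=0.9,en;q=0.8" yields "de"; an empty header yields "".
    """
    first = accept_language.split(",")[0].split(";")[0].strip().lower()
    return first.split("-")[0]


def may_use_expensive_feature(accept_language, default_language,
                              load_per_core, max_load_per_core=0.7):
    """Decide whether a costly feature (PDFs, search suggestions) is served."""
    if primary_language(accept_language) == default_language.lower():
        return True  # visitors using the instance's own language always pass
    # Everybody else is only served while the load is at or below the limit
    return max_load_per_core >= load_per_core


def current_load_per_core():
    """One-minute load average divided by the CPU count (Unix only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)</pre>



<p>A request handler would call <code>may_use_expensive_feature()</code> with the request&#8217;s header, the instance&#8217;s default language and <code>current_load_per_core()</code>, and show an error page or skip the suggestions if the result is <code>False</code>.</p>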



<h4 class="wp-block-heading">Timelines</h4>



<p>Timeline pages as implemented in museum-digital&#8217;s frontend offer another source of endless links and search queries, as they link to further and further specifications of the time searched by. Again, an improvement already introduced months ago was to better parse queries by time: If a user searches for objects that are linked to times &#8220;after 1920&#8221; and &#8220;after 1930&#8221;, the latter condition already implies the former. &#8220;After 1920 and after 1930&#8221; means exactly the same as &#8220;after 1930&#8221;. Which is one <a href="https://en.wikipedia.org/wiki/Join_(SQL)">join</a> instead of two &#8211; half the resource usage.</p>
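


<p>The simplification of redundant time constraints can be sketched in a few lines of Python. The representation of a constraint as a pair like <code>("after", 1930)</code> and the function name are assumptions made for this example.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def simplify_time_constraints(constraints):
    """Collapse redundant ("after", year) / ("before", year) constraints.

    All constraints are combined with AND, so only the latest "after" and
    the earliest "before" can still narrow the result.
    """
    after = [year for kind, year in constraints if kind == "after"]
    before = [year for kind, year in constraints if kind == "before"]
    simplified = []
    if after:
        simplified.append(("after", max(after)))
    if before:
        simplified.append(("before", min(before)))
    return simplified</pre>



<p>However many time constraints a query string contains, at most one &#8220;after&#8221; and one &#8220;before&#8221; condition remain, and each remaining condition corresponds to one join in the database query.</p>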



<p>A minor improvement we noticed on the side concerned the impact of automatic redirects in the timelines. Say a user searches objects by their link to a given tag and then generates a timeline for said objects. If all objects were created in the 20th century, the timeline will automatically redirect so as to &#8220;zoom&#8221; into a more appropriate time scale than from the Big Bang to now. Until the last weekend, script execution was not stopped when that redirect happened &#8211; which means that all database queries for the time from the Big Bang to now were still executed even though the user never got to see the results. That is now fixed.</p>



<h2 class="wp-block-heading">The Anticlimactic Solution</h2>



<p>All of those changes got the frontend more or less stable. Problems with uploading images remained, however. In the end, the only thing that helped was uninstalling libvips (which we use for image manipulation) and reinstalling it. That seems to have fixed the issues.</p>



<p>Especially as the number of requests from scrapers continues to increase, the strategy outlined above seems to be fruitful. By reducing the use (and sometimes the availability altogether) of especially resource-intensive and &#8211; depending on the context &#8211; not very useful functionalities, much stability can be gained.</p>



<p>The update seems to finally be largely completed (aside from maybe some further fine-tuning of the PHP-FPM configuration) and museum-digital is stable despite the bot problem, while we haven&#8217;t had to take more drastic or costly actions yet &#8211; such as blocking or adding additional servers.</p>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/12/09/updates-ai-scrapers-and-resilience/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/12/AI-gen-dove-and-robots-scaled.webp</url><width>600</width><height>411</height></post-thumbnail>	</item>
		<item>
		<title>State of Development, November 2025</title>
		<link>https://blog.museum-digital.org/2025/12/03/state-of-development-november-2025/</link>
					<comments>https://blog.museum-digital.org/2025/12/03/state-of-development-november-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Wed, 03 Dec 2025 01:53:58 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[OAI-PMH]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4575</guid>

					<description><![CDATA[Frontend musdb Importer Core Parser]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h2>



<ul class="wp-block-list">
<li>On source / reference pages, linked objects are now sorted by the position within the source work at which they are referenced or which they themselves reference</li>



<li>The target URL of the regular / unspecified search bar for objects now follows the new, prettier URL schema</li>



<li>Support for an <a href="https://www.openarchives.org/pmh/">OAI-PMH</a> API for object metadata
<ul class="wp-block-list">
<li>Standardized endpoint for aggregators seeking to retrieve data in batch</li>



<li>Metadata formats thus far supported:
<ul class="wp-block-list">
<li>LIDO</li>



<li>OAI-DC (mandatory)</li>
</ul>
</li>



<li>See also: <a href="https://blog.museum-digital.org/2025/11/24/making-interoperability-easy/">Blog</a></li>
</ul>
</li>



<li>PDFs are only generated for users with a browser set to a non-default language if load on the server is low
<ul class="wp-block-list">
<li>The resource use caused by AI bots scraping museum-digital has been growing and growing. Generally, we see bots included in our mission to enable access to cultural heritage. On the other hand, nobody is served if the service is bogged down by bots. One functionality that is commonly used among bots and resource-intensive is the generation of PDFs for object pages. The same information can be loaded from the object page itself and printed to a PDF using the browser&#8217;s print option. There are thus rather few downsides to limiting access to PDF generation to times when server load is low. So that&#8217;s what we did.</li>
</ul>
</li>



<li>Collection-specific ISIL identifiers are now also used in the LIDO API</li>



<li>Alternative numbers of an object can now be displayed on object pages
<ul class="wp-block-list">
<li>This includes tooltips for types of alternative numbers that can be set by the museum on the institution-wide settings pages of musdb</li>
</ul>
</li>
</ul>
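
<p>For aggregators, batch retrieval via OAI-PMH could look roughly like the following Python sketch. The <code>ListRecords</code> verb and the <code>metadataPrefix</code> and <code>resumptionToken</code> arguments come from the OAI-PMH 2.0 specification; the base URL, the sample response and the function names are assumptions made for illustration and do not describe the actual endpoint.</p>

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL as defined by OAI-PMH 2.0."""
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_element.text if token_element is not None and token_element.text else None
    return identifiers, token

# Hypothetical base URL; replace it with the endpoint of the instance to harvest.
BASE_URL = "https://example.org/oai"

# Shortened, made-up response page used instead of a live request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_page(SAMPLE)
print(identifiers, token)
print(list_records_url(BASE_URL, resumption_token=token))
```

<p>A harvester would repeat the request with the returned token until a page arrives without one.</p>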



<h2 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h2>



<ul class="wp-block-list">
<li>Search for objects
<ul class="wp-block-list">
<li>Type-ahead search for languages (of the object&#8217;s content)</li>



<li>Search by object&#8217;s revision status (open, read-only, archived, etc.)</li>
</ul>
</li>



<li>Batch editing of objects&#8217; revision status</li>



<li>Parameters of the full text search index were updated to improve the search of word compounds in German</li>
</ul>



<h3 class="wp-block-heading">Importer</h3>



<h4 class="wp-block-heading">Core</h4>



<ul class="wp-block-list">
<li>The dry-run mode no longer aborts an import when an unmapped value is encountered. Unmapped entries are collected and displayed together afterwards.
<ul class="wp-block-list">
<li>This means that unmapped entries can now be copied to and mapped in <a href="https://concordance.museum-digital.org/">concordance.museum-digital.org</a> much more easily</li>
</ul>
</li>



<li>Support for the import of alternative numbers (of objects)</li>



<li>Support for the import of space hierarchies</li>
</ul>
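
<p>The changed dry-run behavior can be pictured with a small Python sketch. It is an illustration only, not the importer&#8217;s code: <code>map_values</code>, <code>UnmappedValue</code> and the sample mapping are hypothetical names and data.</p>

```python
class UnmappedValue(Exception):
    """Raised outside of dry-run mode when a value has no mapping."""

def map_values(rows, mapping, dry_run=False):
    """Map (field, value) pairs; in dry-run mode, collect misses instead of aborting."""
    mapped, unmapped = [], []
    for field, value in rows:
        key = (field, value)
        if key in mapping:
            mapped.append(mapping[key])
        elif dry_run:
            # Remember each unmapped entry once, in order of first appearance.
            if key not in unmapped:
                unmapped.append(key)
        else:
            raise UnmappedValue(f"No mapping for {field}: {value}")
    return mapped, unmapped

mapping = {("material", "Holz"): "wood"}
rows = [
    ("material", "Holz"),
    ("material", "Eisen"),
    ("technique", "gegossen"),
    ("material", "Eisen"),
]

mapped, unmapped = map_values(rows, mapping, dry_run=True)
# Print the collected entries as one tab-separated list that is easy to copy.
for field, value in unmapped:
    print(f"{field}\t{value}")
```

<p>Collecting first and reporting once at the end is what makes it possible to hand over all missing entries in a single pass.</p>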



<h4 class="wp-block-heading">Parser</h4>



<ul class="wp-block-list">
<li><code>AdlibXml</code>
<ul class="wp-block-list">
<li>Support for importing objects&#8217; alternative numbers</li>
</ul>
</li>



<li><code>CsvXml</code>
<ul class="wp-block-list">
<li>Support for importing objects&#8217; alternative numbers</li>
</ul>
</li>



<li><code>CsvLocations</code>
<ul class="wp-block-list">
<li>New parser for csv-based imports of space hierarchies</li>
</ul>
</li>



<li><code>ImageByInvno</code>
<ul class="wp-block-list">
<li>New setting: append_chars (adds suffixes that exist in the inventory number but not in the file names)</li>
</ul>
</li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/12/03/state-of-development-november-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/12/winter.webp</url><width>600</width><height>467</height></post-thumbnail>	</item>
		<item>
		<title>State of Development, October 2025</title>
		<link>https://blog.museum-digital.org/2025/11/25/state-of-development-october-2025/</link>
					<comments>https://blog.museum-digital.org/2025/11/25/state-of-development-october-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Tue, 25 Nov 2025 16:55:09 +0000</pubDate>
				<category><![CDATA[Community]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Dissemination]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object search (frontend)]]></category>
		<category><![CDATA[User interface]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4564</guid>

					<description><![CDATA[A summary of recent updates and development around museum-digital in October 2025.]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Development</h2>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li>Significantly reworked the display of transcriptions on object pages
<ul class="wp-block-list">
<li>Titles of transcriptions are now displayed
<ul class="wp-block-list">
<li>If none is set, the type of the transcription (original or translation) is used as a replacement</li>
</ul>
</li>



<li>Transcriptions are sorted by their titles</li>



<li>Improved the display of transcriptions in tiles
<ul class="wp-block-list">
<li>Problems with vertical scrolling are now solved</li>



<li>If only one transcription has been recorded, it will be displayed on the full width of the page</li>



<li>If there are more than two transcriptions for an object, they are folded in by default and can be unfolded later on</li>
</ul>
</li>
</ul>
</li>



<li>Batch export of object metadata via the API
<ul class="wp-block-list">
<li>Thus far available in JSON &amp; LIDO</li>



<li><a href="https://nat.museum-digital.de/swagger/#/object/jsonExportObjects">API documentation</a></li>



<li><a href="https://blog.museum-digital.org/2025/11/24/making-interoperability-easy/">See also</a></li>
</ul>
</li>



<li>Decimal points in object measurements are replaced with a decimal comma in languages that require one</li>



<li>Collection-specific ISIL IDs are used in the LIDO API</li>
</ul>
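
<p>The replacement of the decimal separator in measurements amounts to something like the following Python sketch. The set of languages and the function name are assumptions for illustration; the post does not say which languages are covered.</p>

```python
# Assumed, incomplete set of language codes that write decimals with a comma.
DECIMAL_COMMA_LANGUAGES = {"de", "fr", "hu", "pl"}

def format_measurement(value, unit, lang):
    """Format a measurement, using a decimal comma where the language requires it."""
    text = str(value)
    if lang in DECIMAL_COMMA_LANGUAGES:
        text = text.replace(".", ",")
    return f"{text} {unit}"

print(format_measurement(12.5, "cm", "en"))
print(format_measurement(12.5, "cm", "de"))
```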



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h3>



<ul class="wp-block-list">
<li>Added a field for recording titles / names of transcriptions</li>



<li>Added the option to set collection-specific ISIL IDs</li>



<li>Setting object type tags via the improvement suggestions now correctly classifies the thus created link between object and tag</li>



<li>Additional shapes are now available
<ul class="wp-block-list">
<li>E.g.: round, square</li>
</ul>
</li>



<li>Object groups can now be filtered by whether they have a superordinate one or not</li>
</ul>



<h3 class="wp-block-heading">Dissemination</h3>



<ul class="wp-block-list">
<li>2025-10-08: <a href="https://www.jrenslin.de/talks/interoperabilitaet-schaffen-geschichten-aus-1001-importen-herbsttagung/">Presentation</a> at the Autumn Conference of the Working Group Documentation of the German Museum Association: &#8220;Interoperabilität schaffen &#8211; Geschichten aus 1001 Importen&#8221;
<ul class="wp-block-list">
<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-10-08_1001-Importe_Herbsttagung-FG-Doku_JRE.pdf">PDF</a></li>



<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-10-08_1001-Importe_Herbsttagung-FG-Doku_JRE.odp">ODP</a></li>
</ul>
</li>



<li>2025-10-14: <a href="https://www.jrenslin.de/talks/civers-2025/">Talk</a> at a workshop of the project <a href="https://www.dainst.org/forschung/projekte/citation-of-versioned-web-pages-by-pid-civers/5926">CiVers (Citation of Versioned Web Pages by PID)</a>
<ul class="wp-block-list">
<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-10-14_museum-digital_Civers_JRE.pdf">PDF</a></li>



<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-10-14_museum-digital_Civers_JRE.odp">ODP</a></li>
</ul>
</li>



<li>2025-10-17: <a href="https://verein.museum-digital.de/museum-digital-usertagung-2025/">museum-digital Usertagung 2025</a></li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/11/25/state-of-development-october-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/11/AI-gen-blog-202511-state-of-2025-10.png-scaled.webp</url><width>600</width><height>467</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, September 2025</title>
		<link>https://blog.museum-digital.org/2025/11/25/state-of-dev-september-2025/</link>
					<comments>https://blog.museum-digital.org/2025/11/25/state-of-dev-september-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Tue, 25 Nov 2025 16:54:06 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Controlled Vocabularies]]></category>
		<category><![CDATA[New Features]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4554</guid>

					<description><![CDATA[Recent (technical) development around museum-digital in September 2025.]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Development</h2>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li>Objects that are linked to a source as referencing the source or being referenced in it are now listed on the source&#8217;s page
<ul class="wp-block-list">
<li>Example: <a href="https://hessen.museum-digital.de/source/1950">&#8220;Novalis Schriften. Die Werke Friedrich von Hardenbergs. Historisch-kritische Ausgabe. Erster Band: Das dichterische Werk. 3. Auflage&#8221;</a></li>
</ul>
</li>



<li>The object page now displays the new transcription fields for notes, status, and type of the transcription</li>



<li>New types of classifications for the link between an object and a tag
<ul class="wp-block-list">
<li>Taxon</li>



<li>Topic</li>



<li>Mentioned subject (similar to the existing &#8220;display subject&#8221;)</li>
</ul>
</li>



<li>Dependencies
<ul class="wp-block-list">
<li>Updated OpenLayers to Version 10.6</li>
</ul>
</li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_referenced-sources_en.png-1024x576.webp" alt="Screenshot: Objects referenced in source on source pages" class="wp-image-4548" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_referenced-sources_en.png-1024x576.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_referenced-sources_en.png-300x169.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_referenced-sources_en.png-1536x864.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_referenced-sources_en.png.webp 1920w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Source pages in the public frontend / portals of museum-digital now display all objects linked to the source as either referencing the source or being referenced by their source.</figcaption></figure>



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/musdb/">musdb</a></h3>



<ul class="wp-block-list">
<li>Renaming vocabulary entries to blacklisted terms is now impossible
<ul class="wp-block-list">
<li>Someone managed to create an actor &#8220;unknown&#8221; by adding a different actor and renaming the new actor entry to &#8220;unknown&#8221; afterwards.</li>
</ul>
</li>



<li>Added a sidebar to object group overview pages for additional filtering options</li>



<li>New types of classifications for the link between an object and a tag
<ul class="wp-block-list">
<li>Taxon</li>



<li>Topic</li>



<li>Mentioned subject (similar to the existing &#8220;display subject&#8221;)</li>
</ul>
</li>



<li>New APIs for listing all vocabulary entries linked to the objects of a museum</li>



<li>Dependencies
<ul class="wp-block-list">
<li>Updated OpenLayers to Version 10.6</li>
</ul>
</li>
</ul>
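
<p>The loophole described above, renaming an existing entry to a blocked term, is closed when adding and renaming share one check. A minimal Python sketch of that idea follows; the blacklist contents and all class and function names are hypothetical and not taken from musdb.</p>

```python
# Hypothetical blacklist; the real list of blocked terms is not given in the post.
BLACKLIST = {"unknown", "unbekannt"}

class BlacklistedTerm(ValueError):
    """Raised when a vocabulary entry would receive a blocked name."""

def checked_name(name):
    """Return the trimmed name, or raise if it is blacklisted."""
    cleaned = name.strip()
    if cleaned.casefold() in BLACKLIST:
        raise BlacklistedTerm(f"'{cleaned}' is not allowed as an entry name")
    return cleaned

class Vocabulary:
    def __init__(self):
        self.entries = {}

    def add(self, name):
        entry_id = len(self.entries) + 1
        self.entries[entry_id] = checked_name(name)
        return entry_id

    def rename(self, entry_id, new_name):
        # Same check as in add(), so renaming cannot bypass the blacklist.
        self.entries[entry_id] = checked_name(new_name)

actors = Vocabulary()
actor_id = actors.add("Jane Doe")
try:
    actors.rename(actor_id, "Unknown")
except BlacklistedTerm as error:
    print(error)
```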



<h3 class="wp-block-heading">Importer</h3>



<ul class="wp-block-list">
<li>Core
<ul class="wp-block-list">
<li>Deaccessions are now covered by the import tool</li>



<li>Recipients of deaccessions are linked to the address book / contact list</li>
</ul>
</li>



<li>Parser
<ul class="wp-block-list">
<li>CSVXML: The CSVXML parser now covers deaccessions as well</li>



<li>ImageByInvno: Added an option to ignore any characters before a given starting combination of characters (if there is a consistent start of the inventory number).</li>
</ul>
</li>
</ul>
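
<p>The new ImageByInvno option can be thought of as trimming everything in a file name that precedes a known start of the inventory number. The Python sketch below illustrates this under assumptions: the function name, the <code>start</code> parameter and the sample file name are invented for the example and do not reflect the parser&#8217;s actual settings.</p>

```python
from pathlib import PurePath

def invno_from_filename(filename, start=None):
    """Derive an inventory number from an image file name.

    If start is given and occurs in the name, anything before it is ignored.
    """
    stem = PurePath(filename).stem
    if start:
        position = stem.find(start)
        if position > 0:
            stem = stem[position:]
    return stem

print(invno_from_filename("scan_0042_INV-1998-12.jpg", start="INV-"))
```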



<h3 class="wp-block-heading"><a href="https://csvxml.imports.museum-digital.org/">CSVXML</a></h3>



<ul class="wp-block-list">
<li>New fields
<ul class="wp-block-list">
<li>tag_related_identifier_type</li>



<li>tag_related_identifier</li>
</ul>
</li>
</ul>



<h2 class="wp-block-heading">Dissemination</h2>



<ul class="wp-block-list">
<li><a href="https://www.jrenslin.de/talks/von-museum-digital-zum-eigenen-online-katalog-ag-digitalisierung-mv-rlp/">Presentation</a> &#8220;Von museum-digital zum eigenen Online-Katalog&#8221; for the Working Group Digitization of the State Museum Association of Rhineland-Palatinate
<ul class="wp-block-list">
<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-09-10_Von-museum-digital-zum-eigenen-Online-Katalog_JRE.pdf">Slides: PDF</a></li>



<li><a href="https://files.museum-digital.org/de/Praesentationen/2025-09-10_Von-museum-digital-zum-eigenen-Online-Katalog_JRE.odp">Slides: ODP</a></li>
</ul>
</li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/11/25/state-of-dev-september-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/11/AI-gen-blog-202511-state-of-2025-09.png-scaled.webp</url><width>600</width><height>467</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, August 2025</title>
		<link>https://blog.museum-digital.org/2025/11/25/state-of-dev-august-2025/</link>
					<comments>https://blog.museum-digital.org/2025/11/25/state-of-dev-august-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Tue, 25 Nov 2025 16:53:37 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[nodac]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Poster]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4545</guid>

					<description><![CDATA[A summary of the recent updates and (technical) development around museum-digital in August 2025.]]></description>
										<content:encoded><![CDATA[
<p>The last months have been busy, both on and off museum-digital. This is the first of three posts today on recent technical developments around museum-digital, continuing the regular state of dev posts.</p>



<h2 class="wp-block-heading">Development</h2>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h3>



<ul class="wp-block-list">
<li>Added a tool for the AI-aided detection of displayed subjects in images for tagging
<ul class="wp-block-list">
<li>Has to be explicitly enabled on a per-collection basis, as it makes sense only for certain types of objects (like paintings, drawings, photographs)</li>



<li>Usable within the tagging overlay on object editing pages</li>



<li>See also: <a href="https://blog.museum-digital.org/de/2025/08/27/automatische-erkennung-von-abgebildeten-elementen/">blog post about the feature (German)</a></li>
</ul>
</li>



<li>The maximum length of contents in the data field &#8220;edition&#8221; of literature entries has been extended to 50 characters</li>



<li>Uploaded PDFs may now be up to 40 MB in size</li>



<li>New command line option to reset all permissions of a user account to the default provided by their user role</li>



<li>New event type: &#8220;Changed&#8221;</li>
</ul>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/nodac/">nodac</a></h3>



<ul class="wp-block-list">
<li>Added AI-generated suggestions for tag definitions and translations of tag names in a sidebar
<ul class="wp-block-list">
<li>This is also re-used to identify duplicate tags</li>
</ul>
</li>
</ul>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_screenshot-nodac-ai-sidebar.png-1024x576.webp" alt="Screenshot of a tag editing page in nodac, showing the new (August 2025) sidebar with AI-generated aids." class="wp-image-4547" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_screenshot-nodac-ai-sidebar.png-1024x576.webp 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_screenshot-nodac-ai-sidebar.png-300x169.webp 300w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_screenshot-nodac-ai-sidebar.png-1536x864.webp 1536w, https://blog.museum-digital.org/wp-content/uploads/2025/11/20251125_screenshot-nodac-ai-sidebar.png.webp 1920w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The right sidebar of tag editing pages in nodac now features AI-generated tag descriptions as well as possible translations for the title. These are also used to identify possible duplicates of a tag entry (top right, underlined in purple).</figcaption></figure>



<h2 class="wp-block-heading">Dissemination</h2>



<ul class="wp-block-list">
<li>Poster presented at <a href="https://www.nfdi.de/cordi-2025/">CoRDI 2025</a> (<a href="https://web.archive.org/web/20250612013921/https://www.nfdi.de/cordi-2025/">Archived</a>) on August 27, 2025, in Aachen: <a href="https://www.jrenslin.de/talks/case-for-underhanded-methods-to-improve-research-data-cordi/">&#8220;To Educate or to Enforce &#8211; The Case for Underhanded Methods to Improve Research Data&#8221;</a>
<ul class="wp-block-list">
<li><a href="https://files.museum-digital.org/en/Posters/2025-08-26_To-Educate-or-to-Enforce_CoRDI2025-Aachen_JRE.pdf">PDF</a></li>



<li><a href="https://www.jrenslin.de/abstracts/cordi-2025-caseforunderhandedmethodsimproveresearchdata/">Abstract</a> / <a href="https://zenodo.org/records/16736291">Zenodo</a></li>
</ul>
</li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/11/25/state-of-dev-august-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/11/AI-gen-blog-202511-state-of-2025-08.png-scaled.webp</url><width>600</width><height>467</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, June &#038; July 2025</title>
		<link>https://blog.museum-digital.org/2025/08/23/state-of-dev-june-july-2025/</link>
					<comments>https://blog.museum-digital.org/2025/08/23/state-of-dev-june-july-2025/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Sat, 23 Aug 2025 21:12:41 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Minor Improvements]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4519</guid>

					<description><![CDATA[June and especially July were at first glance once again rather slow months in terms of development at museum-digital. Generally, the pace and type of development seems to have changed this year. Rather than doing many small improvements all over the place, there are fewer but larger and more labor-intensive changes and new features. <a href="https://blog.museum-digital.org/2025/08/23/state-of-dev-june-july-2025/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>June and especially July were at first glance once again rather slow months in terms of development at museum-digital.</p>



<p>Generally, the pace and type of development seems to have changed this year. Rather than doing many small improvements all over the place, there are fewer but larger and more labor-intensive changes and new features. See for example the <a href="https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/">versioning</a> in musdb (January), the tool for <a href="https://blog.museum-digital.org/de/2025/03/08/das-importieren-automatisieren/">automating imports</a> based on what others call a &#8220;hot folder&#8221; and the sort option to <a href="https://blog.museum-digital.org/2025/03/06/sort-by-beauty/">sort objects by their images&#8217; aesthetics score</a> (both presented here in March), and the new tool for suggesting formulations for object descriptions using large language models (June).</p>



<h2 class="wp-block-heading">July</h2>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li>Translation of the software to <a href="https://blog.museum-digital.org/2025/07/13/hindi/">Hindi</a> and <a href="https://blog.museum-digital.org/2025/07/02/browse-museum-digital-in-telugu/">Telugu</a></li>



<li>Grouping of tags by their relation to a given object<br><em>If an object is linked to more than 10 tags, the tag list of object pages quickly starts looking unorganized and messy. In such cases, the tags will now be displayed grouped by their relationship to the object (thus far: general tag, material, technique, object type, display subject)</em></li>
</ul>
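
<p>The grouping rule for tags can be sketched in Python as follows. The threshold of 10 and the relation names are taken from the description above; the data structure and the function name are assumptions for illustration.</p>

```python
from collections import defaultdict

GROUPING_THRESHOLD = 10
RELATION_ORDER = ["general tag", "material", "technique", "object type", "display subject"]

def arrange_tags(tags):
    """Return a list of (heading, tag names) blocks for display.

    tags is a list of (name, relation) pairs. Up to the threshold, all tags
    form one block without a heading; above it, they are grouped by relation.
    """
    if len(tags) <= GROUPING_THRESHOLD:
        return [(None, [name for name, _ in tags])]
    groups = defaultdict(list)
    for name, relation in tags:
        groups[relation].append(name)
    # Known relations first, in a fixed order; any others follow alphabetically.
    others = sorted(relation for relation in groups if relation not in RELATION_ORDER)
    ordered = [relation for relation in RELATION_ORDER if relation in groups] + others
    return [(relation, groups[relation]) for relation in ordered]

few = [("vase", "object type"), ("glass", "material")]
many = [(f"tag{i}", "general tag") for i in range(9)] + few
print(arrange_tags(few))
print([heading for heading, _ in arrange_tags(many)])
```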



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h3>



<h4 class="wp-block-heading">New Features</h4>



<ul class="wp-block-list">
<li>Export option for the specific LIDO variant expected by the German Digital Library&#8217;s &#8220;colonial contexts&#8221; portal (Koloniale Kontexte-Portal der Deutschen Digitalen Bibliothek)</li>
</ul>



<h4 class="wp-block-heading">Improvements &amp; Changes</h4>



<ul class="wp-block-list">
<li>The minimum length of fulltext search terms for object search parameters is now visibly enforced in the extended search user interface<br><em>So as not to overly burden the fulltext search server, any term in a full text search in musdb needs to be at least two characters long. Thus far, shorter full text search parameters were simply ignored, which was certainly confusing at times. Since June, a check in the extended search overlay prevents search queries with shorter search parameters.</em></li>



<li>Reception history of objects: Statements of the relevant position within a source can now be 40 characters long</li>



<li>Transcriptions
<ul class="wp-block-list">
<li>May now be up to 4,000,000 characters long</li>



<li>New fields: Notes on the transcription, status, aims</li>
</ul>
</li>
</ul>
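
<p>The check on full text search terms boils down to rejecting a query up front instead of silently dropping its short terms. The Python sketch below shows the idea under assumptions: the minimum of two characters is taken from the text above, while the function names and the splitting on whitespace are invented for the example.</p>

```python
MIN_TERM_LENGTH = 2

def too_short_terms(query):
    """Return all whitespace-separated terms shorter than the minimum length."""
    return [term for term in query.split() if len(term) < MIN_TERM_LENGTH]

def validate_fulltext_query(query):
    """Reject the query with a visible error rather than ignoring short terms."""
    short = too_short_terms(query)
    if short:
        raise ValueError(f"Search terms too short: {', '.join(short)}")
    return query

print(too_short_terms("a vase"))
print(validate_fulltext_query("glass vase"))
```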



<h4 class="wp-block-heading">Bugfixes</h4>



<ul class="wp-block-list">
<li>Fixed a bug in the batch editing of specific visibility flags for data fields on the addendum tab</li>
</ul>



<h2 class="wp-block-heading">June</h2>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li>Performance improvements
<ul class="wp-block-list">
<li>Object search now runs without a connection to the full text search server if no full text search parameter is set</li>



<li>If multiple search parameters for an earliest / latest time have been set, they are parsed and combined before being forwarded to the database</li>
</ul>
</li>



<li>Improvements in the deletion of temporarily created PDF files (PDF export)</li>



<li>Navigation has been translated to <a href="https://blog.museum-digital.org/de/2025/06/23/tamil/">Tamil</a></li>
</ul>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h3>



<h4 class="wp-block-heading">New Features</h4>



<ul class="wp-block-list">
<li>Recipients of deaccessioned objects can now be linked from within the address book</li>



<li><a href="https://blog.museum-digital.org/de/2025/06/19/ki-objektbeschreibungen/">New tool for AI-aided formulation of object descriptions (based on existing other metadata)</a></li>
</ul>



<h4 class="wp-block-heading">Improvements</h4>



<ul class="wp-block-list">
<li>Object search now runs without a connection to the full text search server if no full text search parameter is set</li>
</ul>



<h3 class="wp-block-heading"><a href="https://blog.museum-digital.org/category/development/importer-en-en/">Importer</a></h3>



<ul class="wp-block-list">
<li>The CSVXML parser has been extended to cover new event types and markings</li>



<li>Object groups automatically generated to group all objects of an import can now receive a description as set within the import configuration</li>
</ul>



<h3 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/nodac/">nodac</a></h3>



<p>The list of selectable languages for the navigation of nodac has now been restricted to those for which there is actually a complete translation.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/08/23/state-of-dev-june-july-2025/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/08/md-blog-palms.webp</url><width>600</width><height>411</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, May 2025</title>
		<link>https://blog.museum-digital.org/2025/07/07/state-of-dev-may-2025/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 07 Jul 2025 15:00:26 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[nodac]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4430</guid>

					<description><![CDATA[An overview of recent developments around museum-digital in May 2025.]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/frontend/">Frontend</a></h2>



<h3 class="wp-block-heading">New Features</h3>



<ul class="wp-block-list">
<li>Tabs are available in the configuration when the site is installed as a Progressive Web App (<a href="https://de.wikipedia.org/wiki/Progressive_Webanwendung">PWA</a>) (<a href="https://developer.chrome.com/docs/capabilities/tabbed-application-mode">See also</a>)</li>
</ul>



<h3 class="wp-block-heading">Improvements</h3>



<ul class="wp-block-list">
<li>Duplicate search query parameters for objects associated with times before / after are removed.<br><em>Example: If one searches for &#8220;Objects after 1900, whose first recorded associated time is also after 2000&#8221;, this is a duplicate query. 2000 is always after 1900, so one of the two parameters can be removed.</em><br><em>Searches by time begin / end are also the foundation for the timeline. As the timeline is both beloved by web crawlers and resource-intensive to generate, this change significantly reduced server load.</em></li>



<li>Any links on timeline pages that do not reference single object pages are marked as <code>rel=nofollow</code><br><em>Which is to say that bots are told to ignore them.</em></li>
</ul>
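
<p>The removal of redundant time parameters can be illustrated with a short Python sketch: of several &#8220;after&#8221; bounds only the latest matters, and of several &#8220;before&#8221; bounds only the earliest. The function name and the representation of times as plain years are assumptions for illustration.</p>

```python
def combine_time_bounds(after=(), before=()):
    """Reduce several 'after' / 'before' years to the strictest bound of each kind."""
    earliest = max(after) if after else None
    latest = min(before) if before else None
    return earliest, latest

# "After 1900" and "after 2000" collapse into "after 2000".
print(combine_time_bounds(after=[1900, 2000]))
print(combine_time_bounds(after=[1800], before=[1950, 1900]))
```

<p>With the bounds reduced like this, the database only ever sees one condition per direction, however many parameters the request carried.</p>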



<h2 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/musdb/">musdb</a></h2>



<h3 class="wp-block-heading">New Features</h3>



<ul class="wp-block-list">
<li>Tabs are available in the configuration when musdb is installed as a Progressive Web App (<a href="https://de.wikipedia.org/wiki/Progressive_Webanwendung">PWA</a>) (<a href="https://developer.chrome.com/docs/capabilities/tabbed-application-mode">See also</a>)</li>



<li>New search option for objects: &#8220;Can be published&#8221;<br><em>Aggregate search for objects that have not yet been published and which comply with the minimum requirements for publication.</em></li>



<li>Separated measurements can now be batch-edited<br><em>Available via &#8220;assign results&#8221;</em></li>



<li>Institution-wide setting to enforce the use of the user-defined object editing interface for all users of the institution</li>



<li>Exhibitions can now be searched via the API (<em><a href="https://de.handbook.museum-digital.info/musdb/API/index.html#/exhibition/exhibitionList">/exhibition/list</a></em>)</li>



<li>A user&#8217;s <a href="https://en.wikipedia.org/wiki/User_agent">user-agent</a> is checked whenever a page is requested. If it changed, the user will be logged out automatically.<br><em>Protection against <a href="https://owasp.org/www-community/attacks/Session_hijacking_attack">session hijacking</a>.</em></li>
</ul>
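
<p>The user-agent check can be sketched in Python as follows. This is a simplified illustration, not musdb&#8217;s implementation: the session is modeled as a plain dictionary and the function names are hypothetical.</p>

```python
def start_session(session, user_id, user_agent):
    """Remember the user agent that was used to log in."""
    session["user_id"] = user_id
    session["user_agent"] = user_agent

def check_session(session, user_agent):
    """Return True if the request may proceed; otherwise log the user out."""
    if session.get("user_agent") != user_agent:
        # A changed user agent may indicate a stolen session ID: end the session.
        session.clear()
        return False
    return True

session = {}
start_session(session, user_id=7, user_agent="Browser A")
print(check_session(session, "Browser A"))
print(check_session(session, "Browser B"))
print(session)
```

<p>Since a user agent string is easy to forge, a check like this only raises the bar for an attacker and complements other session protections rather than replacing them.</p>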



<h3 class="wp-block-heading">Improvements</h3>



<ul class="wp-block-list">
<li>Panorama images for tours through exhibitions / institutions are scaled down to 2400 px height rather than 1400 px height</li>



<li>The APIs for searching for entries in controlled vocabularies are now available via the main API<br><em>See e.g. </em><code><a href="https://de.handbook.museum-digital.info/musdb/API/index.html#/actor/actorSearchLinkedToObjects">/actor/search_linked_to_objects/{term}</a></code><em> , </em><code><a href="https://de.handbook.museum-digital.info/musdb/API/index.html#/actor/actorSearch">/actor/search/{term}</a></code><em> etc.</em></li>
</ul>



<h3 class="wp-block-heading">Bugfixes</h3>



<ul class="wp-block-list">
<li>Fix: &#8220;Visiting scientists&#8221; couldn&#8217;t open the &#8220;location&#8221; tab (this erroneously required museum-wide editing permissions)</li>



<li>Fix: User-defined defaults for descriptions of new objects were ignored when new objects were to be added</li>



<li>Fix: Thumbnails were displayed as duplicate images linked to exhibitions (tag &#8220;images&#8221;)</li>



<li>Fix: Prefixing via the batch editing was broken</li>
</ul>



<h2 class="wp-block-heading"><a href="https://blog.museum-digital.org/de/category/technik-design/importer-de/">Importer</a></h2>



<h3 class="wp-block-heading">New Features</h3>



<ul class="wp-block-list">
<li>New parser for CSV exports / imports from <a href="https://www.robotron-daphne.de/">Robotron Daphne</a></li>
</ul>



<h3 class="wp-block-heading">Improvements</h3>



<ul class="wp-block-list">
<li>CSVXML Parser
<ul class="wp-block-list">
<li>New literature-related fields (ISSN, editor, etc.) are now covered</li>



<li>References to Wikidata are now respected when tags are imported</li>
</ul>
</li>



<li>The maximum length of any single given tag/keyword is centrally set to 95 characters</li>
</ul>



<h3 class="wp-block-heading">Bugfixes</h3>



<ul class="wp-block-list">
<li>Some more recently added fields from the &#8220;administration&#8221; tag of musdb&#8217;s object pages had been read by the importer, but their transfer into the database had not been implemented thus far.</li>
</ul>



<h2 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/nodac/">nodac</a></h2>



<h3 class="wp-block-heading">New Features</h3>



<ul class="wp-block-list">
<li>Tabs are available in the configuration when nodac is installed as a Progressive Web App (<a href="https://de.wikipedia.org/wiki/Progressive_Webanwendung">PWA</a>) (<a href="https://developer.chrome.com/docs/capabilities/tabbed-application-mode">See also</a>)</li>
</ul>



<h2 class="wp-block-heading"><a href="https://csvxml.imports.museum-digital.org/">CSVXML</a></h2>



<h3 class="wp-block-heading">Improvements</h3>



<ul class="wp-block-list">
<li>Added new literature fields: type, editor, edition / issue, ISSN</li>
</ul>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/06/blog-may-2025-sunflowers-scaled.webp</url><width>600</width><height>343</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, March &#038; April 2025</title>
		<link>https://blog.museum-digital.org/2025/06/08/state-of-dev-march-april-2025/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Sun, 08 Jun 2025 17:04:00 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Institution-specific settings]]></category>
		<category><![CDATA[LIDO]]></category>
		<category><![CDATA[Multilinguality]]></category>
		<category><![CDATA[New Features]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4515</guid>

					<description><![CDATA[Frontend musdb Importer]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/frontend/">Frontend</a></h2>



<ul class="wp-block-list">
<li>A <a href="https://blog.museum-digital.org/2025/03/25/kannada/">Kannada</a> translation of the software is now available</li>



<li>Museums can now choose to display their own citation notes in the citation menu on object pages<br><em>This is especially relevant in case the object itself is to be cited (rather than its record online).</em></li>



<li>&#8220;Or&#8221; search queries can be combined within one search parameter for select attribute search types, e.g. places and tags. The syntax is as follows: <code>place:61~1</code>
<ul class="wp-block-list">
<li>For now, this is only available via the internal query language. As such, there is no corresponding option in the UI for search settings. On the other hand, the search option is thus available via the API.</li>



<li>This option is not available when searching for times or full events</li>
</ul>
</li>
</ul>
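


<p>For API users, a query string in the syntax shown above can be put together programmatically. The following Python sketch is merely an illustration: <code>build_or_query</code> is a hypothetical helper, and it assumes that <code>~</code> separates the alternative IDs within one search parameter, as the example <code>place:61~1</code> suggests.</p>



```python
from typing import Iterable


def build_or_query(attribute: str, ids: Iterable[int]) -> str:
    """Join several IDs into one parameter, e.g. ("place", [61, 1]) -> "place:61~1"."""
    id_list = [str(int(i)) for i in ids]
    if not id_list:
        raise ValueError("at least one ID is required")
    # Assumption: "~" separates the alternatives within one search parameter.
    return f"{attribute}:{'~'.join(id_list)}"


print(build_or_query("place", [61, 1]))
```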



<h2 class="wp-block-heading"><a href="https://en.about.museum-digital.org/software/musdb/">musdb</a></h2>



<ul class="wp-block-list">
<li>Institution-wide settings 
<ul class="wp-block-list">
<li>Free text fields that duplicate similar controlled fields may now be hidden 
<ul class="wp-block-list">
<li>The acquisition of an object may e.g. be directly recorded in the context of a given object or as a separate acquisition process. Recording it as an acquisition process is slightly more labor-intensive, but allows a more fine-grained and accurate documentation. With the new setting, the data fields for recording acquisitions directly in the object context can be hidden from all users of a museum to ensure a uniform use of the preferred functionality.</li>



<li>Institution-specific notes on how to cite objects may now be recorded for display on published object pages.</li>
</ul>
</li>
</ul>
</li>



<li><a href="https://blog.museum-digital.org/2025/03/29/bringing-back-character-driven-search-for-inventory-numbers-in-musdb/">Search queries for inventory numbers are now character-level searches again (rather than following a fulltext search logic)</a></li>



<li>Some event types are incomplete, i.e. they cannot contain a place, or an actor, or a time. This incompleteness was handled differently in musdb, the import tool, and the CSVXML import preparation tool. Now, it is determined by a centralized list and thus consistent across all of museum-digital.</li>



<li>Refactoring of the administrative command line interface<br><em>This mainly concerns auto-correction tools, but also means that exports for the quick export option are now generated automatically every day.</em></li>



<li>Separated measurements are now positioned at the very top of the &#8220;addendum&#8221; tab of object editing pages.</li>



<li>It is now possible to record web links for object groups</li>
</ul>



<h2 class="wp-block-heading"><a href="https://blog.museum-digital.org/category/development/importer-en-en/">Importer</a></h2>



<ul class="wp-block-list">
<li>The import tool can now be used as a harvester
<ul class="wp-block-list">
<li>First use case is a harvester for LIDO records delivered via an OAI-PMH interface.</li>
</ul>
</li>



<li>Externally stored images in formats other than JPG can now be imported </li>



<li>Significantly extended the LIDO parser to (among other things) support the import of multilingual object data</li>
</ul>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/06/blog-march-2025.webp</url><width>600</width><height>343</height></post-thumbnail>	</item>
		<item>
		<title>Bringing back character-driven search for inventory numbers in musdb</title>
		<link>https://blog.museum-digital.org/2025/03/29/bringing-back-character-driven-search-for-inventory-numbers-in-musdb/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Sat, 29 Mar 2025 22:59:32 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4367</guid>

					<description><![CDATA[If you search for &#8220;run&#8221;, you want to find entries (objects, blog posts, etc.), that mention &#8220;ran&#8221;. If you search for inventory numbers like &#8220;*1&#8221;, you want to find &#8220;0001&#8221;. These are fundamentally different categories of search. In the first case, you want to have a language-aware full-text search. In the latter case, you simply <a href="https://blog.museum-digital.org/2025/03/29/bringing-back-character-driven-search-for-inventory-numbers-in-musdb/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>If you search for &#8220;run&#8221;, you want to find entries (objects, blog posts, etc.), that mention &#8220;ran&#8221;. If you search for inventory numbers like &#8220;*1&#8221;, you want to find &#8220;0001&#8221;. These are fundamentally different categories of search. In the first case, you want to have a language-aware full-text search. In the latter case, you simply want to work with characters. In technical terms: Inventory numbers are strings (groups of signs or characters), but not common &#8220;text&#8221;.</p>
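


<p>The character-level side of this distinction can be illustrated with a few lines of Python. This is only a sketch under the assumption that <code>*</code> stands for any run of characters; the inventory numbers are invented, and <code>match_inventory_numbers</code> is a hypothetical helper rather than anything from musdb.</p>



```python
from fnmatch import fnmatchcase
from typing import List

# Invented sample data for illustration only.
inventory_numbers = ["0001", "0010", "A-1", "1999"]


def match_inventory_numbers(pattern: str, numbers: List[str]) -> List[str]:
    """Return all numbers matching the wildcard pattern, character by character."""
    return [number for number in numbers if fnmatchcase(number, pattern)]


print(match_inventory_numbers("*1", inventory_numbers))  # numbers ending in "1"
print(match_inventory_numbers("1*", inventory_numbers))  # numbers starting with "1"
```



<p>No stemming or language awareness is involved here: the pattern is compared to the raw characters of each inventory number. Note that <code>fnmatchcase</code> also gives <code>?</code> and square brackets a special meaning, which a real implementation would have to escape.</p>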



<p>When musdb and museum-digital&#8217;s frontend received their last large-scale update to their respective object search functions around 2021, enabling genuinely complex search requests across almost all of the fields and data types linkable to objects, this was &#8211; among other things &#8211; made possible by our use of <a href="https://manticoresearch.com/">Manticore</a>, a dedicated search server. Traditional relational databases excel at searches by indexes &#8211; pre-defined, known search parameters that one prepares for searchability beforehand &#8211; while search servers like Manticore are reasonably good at that and excel at full-text searches.</p>



<p>In moving the search to Manticore, all searches in free text fields were defined as full-text searches. This was mostly the right decision: Full-text searches are the way to go with objects&#8217; titles, descriptions, and the like. But two specific types of fields posed a challenge, because &#8211; as indicated above &#8211; they usually run on formalized strings that operate very differently from prose: objects&#8217; locations and their inventory numbers. The software does not know about institution-specific and non-standardized rules of formalization, but users do. Hence, the preferred way for those specific types of fields is character-level searching.</p>



<p>For managing locations, we have in the meantime introduced <em>spaces</em> as a dedicated category capable of hierarchization as well as advanced features like the storage of sensor data. Objects&#8217; locations can now simply be expressed as a link to a space, which is by far the superior way when compared to the legacy free text field. If one does so, one can search for objects exactly in a given space, those that are located within it or its sub-spaces (e.g. a box in a depot room), etc. A migration tool from the legacy free-text field to the controlled spaces module is available through musdb&#8217;s dashboard. &#8220;Fixing&#8221; the issue of character-driven searches vs. full-text searches in locations is thus a low-priority issue &#8211; a better alternative is available anyway.</p>



<p>With inventory numbers on the other hand, there is no alternative to character-driven searching.</p>



<h2 class="wp-block-heading">Laying the Foundations: From MySQL to Manticore and (Somewhat) Back</h2>



<p>The basis for the expansion of search capabilities for objects was the introduction of a dedicated search server running Manticore. As the number of requests increased, this proved to be a blessing and &#8211; to some extent &#8211; a curse. Manticore offers more and better search options than a classic relational database, but it does not achieve the same level of stability. As long as queries remain index-bound and not concerned with text, the performance is roughly similar on our hardware (both are about as quick, even with subqueries in MySQL; MySQL uses more resources, but is much more stable). If a query concerns a free-text field on the other hand, there is almost no comparison: Manticore offers a multitude of additional features at great performance.</p>



<p>As stability had become an issue for a while, we adjusted the search to be able to use Manticore or MySQL as a backend, depending on which was more suitable in a specific context. In practice, this means that each search parameter is translated into a query string for Manticore and &#8211; if possible &#8211; for MySQL. If all search parameters have a MySQL equivalent, the search will be performed using the MySQL backend. Otherwise, Manticore will be used. </p>
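


<p>The negotiation described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual implementation: <code>SearchParameter</code> and <code>choose_backend</code> are hypothetical names, and the query strings used with them are placeholders.</p>



```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchParameter:
    """One search criterion, translated for each backend (hypothetical structure)."""
    name: str
    manticore_query: str               # every parameter can be expressed for Manticore
    mysql_query: Optional[str] = None  # None: no MySQL equivalent exists


def choose_backend(parameters: List[SearchParameter]) -> str:
    """Use MySQL only if every parameter has a MySQL equivalent, else Manticore."""
    if all(p.mysql_query is not None for p in parameters):
        return "mysql"
    return "manticore"


# Placeholder query strings, for illustration only.
tag = SearchParameter("tag", "tag_id = 1", "tag_id = 1")
fulltext = SearchParameter("fulltext", "MATCH('helmets')")

print(choose_backend([tag]))            # all parameters have a MySQL equivalent
print(choose_backend([tag, fulltext]))  # the full-text parameter has none
```



<p>A search without any parameters (all objects one has access to) also ends up on the MySQL backend in this sketch, as no parameter lacks a MySQL equivalent.</p>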



<p>This simple way of negotiating which backend is more suitable works only as long as one of the alternatives (Manticore) supports all search options, while the other (MySQL) is preferable in a subset of the search contexts. Which is to say, character-driven searches in inventory numbers break the negotiation logic &#8211; they work somewhat well in MySQL, but do not work in Manticore.</p>



<h2 class="wp-block-heading">Breaking the Logic / Mitigating Confusion</h2>



<p>Up to this weekend, all search options were compatible with each other:</p>



<ul class="wp-block-list">
<li>If one searches for all objects one has access to, both Manticore and MySQL can handle the query. MySQL will be used.</li>



<li>If one searches for all &#8220;helmets&#8221; (tag) from &#8220;Europe&#8221; (place), both Manticore and MySQL can handle the query. MySQL will be used.</li>



<li>If one searches for &#8220;helmets&#8221; (full-text) from &#8220;Europe&#8221; (place), MySQL can only sufficiently handle the search by place, while Manticore can meet both search requirements. Manticore will be used.</li>
</ul>



<p>Character-driven searches by inventory numbers break that compatibility. If one were to search for &#8220;helmets&#8221; (full-text) with inventory numbers starting with 1 (&#8220;1*&#8221;), the search parameter &#8220;helmets&#8221; could only be satisfied by Manticore, while the character-driven search by inventory numbers can only be satisfied by MySQL. Which is to say, the combined search cannot be executed.</p>



<p><strong>Due to popular demand, we reintroduced character-driven searches for inventory numbers into musdb. </strong>As, given our circumstances, there is no way to sensibly combine all search parameters anymore, we had to reduce the resulting confusion. For this, there are theoretically two ways.</p>



<p>The theoretically cleaner way would have been to disable the extension of search queries by full-text-focused parameters once an inventory number had been searched. As a full-text search by inventory number is theoretically still possible, the opposite direction (setting a full-text search first, then searching by inventory number) might still have been acceptable, as it would not have led to visibly different results. The basic idea of this solution would have been to prevent users from performing combined searches that are not possible in the targeted way. But if users managed to do so anyway, the confusion would have been major. Worse yet, it would have been hard to explain &#8211; or rather, it would have been hard to find an appropriate spot in the UI for an explanation &#8211; why certain search options are suddenly disabled.</p>



<p>The alternative route we chose is to allow users to do the impossible combinations, perform the search as best as we can (by transforming the character-driven search by inventory number into a full-text search), and aggressively warn about the likelihood of unexpected results when trying to perform such combined searches. This solution looks unpolished, but it is transparent about the imperfections of the software, and it allows users to find their own solutions to actually perform the combined searches they want. The simplest such solution would be to first search by inventory number, move all the objects into a watch list, and then search by the watch list and combine that search with the full-text search.</p>
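


<p>The chosen route can be sketched as follows. Again, this is a hedged Python illustration rather than musdb&#8217;s code: <code>Param</code>, <code>plan_search</code>, the column name <code>invno</code> and all query strings are hypothetical placeholders. The sketch only shows the order of decisions: use MySQL if everything can be expressed there, otherwise fall back to Manticore, replace the character-driven parameter with its full-text approximation, and collect a warning.</p>



```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Param:
    """A search parameter and its translations (hypothetical structure)."""
    name: str
    manticore: Optional[str]  # None: cannot be expressed in Manticore
    mysql: Optional[str]      # None: cannot be expressed in MySQL
    fulltext_fallback: Optional[str] = None  # Manticore approximation, if any


def plan_search(params: List[Param]) -> Tuple[str, List[str], List[str]]:
    """Return (backend, queries, warnings) for a combined search."""
    if all(p.mysql is not None for p in params):
        return "mysql", [p.mysql for p in params], []
    queries: List[str] = []
    warnings: List[str] = []
    for p in params:
        if p.manticore is not None:
            queries.append(p.manticore)
        elif p.fulltext_fallback is not None:
            # Character-driven search degraded to a full-text search.
            queries.append(p.fulltext_fallback)
            warnings.append(f"'{p.name}' was run as a full-text search; results may be unexpected.")
        else:
            raise ValueError(f"'{p.name}' cannot be combined with the other parameters")
    return "manticore", queries, warnings


# Placeholder queries for illustration only.
inventory = Param("inventory number", None, "invno LIKE '1%'", "MATCH('1*')")
helmets = Param("fulltext", "MATCH('helmets')", None)

backend, queries, warnings = plan_search([helmets, inventory])
print(backend, queries, warnings)
```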



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/03/ai-detective-warning-ukiyo-e-scaled.webp</url><width>600</width><height>343</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, February 2025</title>
		<link>https://blog.museum-digital.org/2025/03/25/state-of-dev-february-2025/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Tue, 25 Mar 2025 16:07:17 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Bugfix]]></category>
		<category><![CDATA[Imports]]></category>
		<category><![CDATA[New Features]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4362</guid>

					<description><![CDATA[In terms of development happening around museum-digital, February 2025 was a rather calm month. While more happened in the "machine room", immediately visible changes are mostly restricted to bugfixes. And a whole new tool.]]></description>
										<content:encoded><![CDATA[
<p>In terms of development happening around museum-digital, February 2025 was a rather calm month. While more happened in the &#8220;machine room&#8221;, immediately visible changes are mostly restricted to bugfixes. And a whole new tool.</p>



<p>Here, as so often, in a list format:</p>



<h2 class="wp-block-heading">Frontend</h2>



<ul class="wp-block-list">
<li><strong>Bugfix</strong>: The detailed description was not reflected in the object API even if set to be so via musdb</li>



<li><strong>Translation</strong>: The frontend is now available in <a href="https://blog.museum-digital.org/2025/03/25/kannada/">Kannada</a></li>
</ul>



<h2 class="wp-block-heading">musdb</h2>



<ul class="wp-block-list">
<li><strong>Bugfix:</strong> Fixed error in setting up new two factor authentication via TOTP (app-based 2fa)</li>



<li><strong>Bugfix:</strong> Symbols for image rotation switched (counter-clockwise symbol rotated clockwise and vice-versa)</li>



<li><strong>Feature</strong>: If the generation of a PDF catalogue is triggered via the sidebar of series editing pages, the objects will follow their order within the object group</li>
</ul>



<h2 class="wp-block-heading">Importer</h2>



<ul class="wp-block-list">
<li>If no explicit name/title for a resource (audio, video, PDF, 3D, externally hosted image) was supplied, there was no fallback. Now, the importer follows the same logic as musdb and will fall back to using the linked object&#8217;s object name as a resource title if none is supplied.</li>



<li>The <a href="https://csvxml.imports.museum-digital.org/">CSVXML</a> parser can now handle multiple objects per XML file.</li>
</ul>
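


<p>The fallback for resource titles mentioned in the list above boils down to a very small rule. A minimal Python sketch, with <code>resource_title</code> as a hypothetical function name and under the assumption that an empty or whitespace-only title counts as &#8220;not supplied&#8221;:</p>



```python
from typing import Optional


def resource_title(supplied_title: Optional[str], object_name: str) -> str:
    """Use the supplied title if there is one, else the linked object's name."""
    title = (supplied_title or "").strip()
    return title if title else object_name


print(resource_title(None, "Helmet"))
print(resource_title("Side view", "Helmet"))
```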



<h2 class="wp-block-heading">Auto uploader</h2>



<p>The auto uploader is an entirely new tool. When first started, users will be asked to enter the necessary data for running imports. And they are asked for the path to a folder which will subsequently be monitored for automatic uploads.</p>



<p>Whenever the tool then detects suitable contents for an import in the folder, the data will be uploaded and automatically imported. Intermediate steps like the generation of a settings file for the import are covered by the tool.</p>



<p>The tool can thus be useful for museums that regularly import data in a consistent format &#8211; say, museums using museum-digital for publishing while maintaining a separate collection management system. It will be of little interest to those who spend much effort on preparing imports (e.g. via <a href="https://csvxml.imports.museum-digital.org/">CSVXML</a>) or who want to migrate to musdb altogether, as it only becomes useful in the case of regular use.</p>



<p>The code is available <a href="https://gitea.armuli.eu/museum-digital/museum-digital-webdav-uploader">here</a>, licensed under GPL v3.<br>See also the <a href="https://blog.museum-digital.org/de/2025/03/08/das-importieren-automatisieren/">more detailed German-language blog post about the auto uploader</a>.</p>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/03/flower-in-water-AI-gen.webp</url><width>600</width><height>375</height></post-thumbnail>	</item>
		<item>
		<title>State of Dev, December 2024 &#038; January 2025</title>
		<link>https://blog.museum-digital.org/2025/02/14/state-of-dev-december-2024-january-2025/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Thu, 13 Feb 2025 23:48:42 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[nodac]]></category>
		<category><![CDATA[Batch editing]]></category>
		<category><![CDATA[Change log]]></category>
		<category><![CDATA[Controlled Vocabularies]]></category>
		<category><![CDATA[Imports]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[TEI]]></category>
		<category><![CDATA[Version control]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4306</guid>

					<description><![CDATA[Once again a simple change log of the recent updates to museum-digital's different tools.]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">December 2024</h2>



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li>Dates in <a href="https://de.wikipedia.org/wiki/Text_Encoding_Initiative">TEI</a> transcriptions are parsed, irrespective of whether <code>when=""</code> or <code>when=''</code> was used</li>



<li>Notes for markings are now publicly displayed<br><em>This was missing thus far and is now implemented similar to how event notes are displayed. If a note exists, a small &#8220;[?]&#8221; appears behind the marking title line. Upon hovering over it, a tooltip appears with the relevant information.</em></li>
</ul>
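


<p>Handling both quoting styles of the <code>when</code> attribute can be done with a single regular expression. The following Python sketch is an illustration only (the frontend may well use a proper XML parser instead); <code>extract_when_values</code> is a hypothetical helper.</p>



```python
import re
from typing import List

# Matches when="..." as well as when='...'; the backreference \1 makes sure
# the closing quote is the same character as the opening one.
WHEN_ATTRIBUTE = re.compile(r"""\bwhen\s*=\s*(["'])(.*?)\1""")


def extract_when_values(tei_fragment: str) -> List[str]:
    """Return the values of all when attributes found in the fragment."""
    return [match.group(2) for match in WHEN_ATTRIBUTE.finditer(tei_fragment)]


print(extract_when_values('when="1850-03-01"'))
print(extract_when_values("when='1850-03-01'"))
```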



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/musdb/">musdb</a></h3>



<ul class="wp-block-list">
<li>Names and descriptions of exhibitions and object groups can now be translated</li>



<li><a href="https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/">Version control</a></li>



<li>Log of &#8220;current locations&#8221; of an object can be exported as a CSV file</li>



<li>Uploaded object images can now be hidden or published in one batch operation</li>



<li><a href="https://de.handbook.museum-digital.info/musdb/API/index.html">API</a> extended
<ul class="wp-block-list">
<li>(New functions)</li>



<li>Transfer object dimensions</li>



<li>List images and resources for an object</li>



<li>Image metadata</li>



<li>Publish / hide object images</li>
</ul>
</li>
</ul>



<h2 class="wp-block-heading">January 2025</h2>



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/frontend/">Frontend</a></h3>



<ul class="wp-block-list">
<li><em>Objects can now be sorted by the aesthetics of the thumbnail</em> (A dedicated blog post on this will follow soon)</li>
</ul>



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/musdb/">musdb</a></h3>



<ul class="wp-block-list">
<li><a href="https://blog.museum-digital.org/de/2025/01/13/versionierung-transfer-zwischen-datenfeldern/">Batch transfer between free text fields of object data</a></li>



<li>The maximum field length for notes on opening hours is now consistent between UI and database</li>



<li>Fixed a bug with switching between institutions during consistency checks (relevant only to users with administrative access to multiple museums)</li>



<li>Literature can now be searched by editors</li>
</ul>



<h3 class="wp-block-heading"><a href="https://blog.museum-digital.org/de/category/technik-design/importer-de/">Importer</a></h3>



<ul class="wp-block-list">
<li>Core
<ul class="wp-block-list">
<li>Automatic transformation of life dates for actors
<ul class="wp-block-list">
<li>Year of death “01.01.2012” now becomes “2012”, instead of “01.01” as before</li>
</ul>
</li>



<li>&#8220;?&#8221; and &#8220;(?)&#8221; are removed from the beginning and end of imported keywords</li>



<li>Various types of brackets in keyword names are converted to regular brackets</li>
</ul>
</li>



<li>Parser
<ul class="wp-block-list">
<li>Stricter internal implementation of settings, all imports can now implement the <code>start_at</code> setting
<ul class="wp-block-list">
<li>This is particularly useful for the repeated execution of imports that abort due to new, previously uncovered elements and other debugging.</li>
</ul>
</li>



<li>New parsers:
<ul class="wp-block-list">
<li><a href="https://de.wikipedia.org/wiki/Metadata_Object_Description_Schema">MODS</a> (mainly used in library contexts)</li>



<li>Parser for Exports from Faust for the <a href="https://st.museum-digital.de/institution/87">Händel-Haus</a></li>



<li>Parser for XML dumps from MuseumPlus Classic (MsSQL > XML export per table > Import)</li>



<li>Bugfixes
<ul class="wp-block-list">
<li>Field “Verwender” in Primus parser was mapped to production events</li>



<li>Material / technology are now imported correctly in the parser for BeeCollect exports for the Industrial Museums of Saxony</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>



<li>“Frontend”
<ul class="wp-block-list">
<li>CLI now also has options for switching off the import of individual areas</li>



<li>Help text for command line tool</li>
</ul>
</li>
</ul>
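


<p>The automatic clean-up steps of the importer core listed above (life dates and keywords) could look roughly like this in Python. This is a sketch under assumptions, not the importer&#8217;s code: the function names are hypothetical, and we assume that square and curly brackets are the &#8220;various types of brackets&#8221; that get converted to round ones.</p>



```python
import re

# A full date in the form DD.MM.YYYY; only the year is kept.
FULL_DATE = re.compile(r"^\d{1,2}\.\d{1,2}\.(\d{4})$")
# Assumption: square and curly brackets are converted to round brackets.
BRACKETS = str.maketrans({"[": "(", "]": ")", "{": "(", "}": ")"})
UNCERTAINTY_MARKS = ("(?)", "?")


def year_from_life_date(value: str) -> str:
    """Reduce a full date such as "01.01.2012" to its year; leave other values alone."""
    match = FULL_DATE.match(value.strip())
    return match.group(1) if match else value.strip()


def clean_keyword(keyword: str) -> str:
    """Strip leading/trailing "?" and "(?)" and normalise brackets."""
    cleaned = keyword.strip()
    changed = True
    while changed:
        changed = False
        for mark in UNCERTAINTY_MARKS:
            if cleaned.startswith(mark):
                cleaned = cleaned[len(mark):].strip()
                changed = True
            if cleaned.endswith(mark):
                cleaned = cleaned[:-len(mark)].strip()
                changed = True
    return cleaned.translate(BRACKETS)


print(year_from_life_date("01.01.2012"))
print(clean_keyword("helmet (?)"))
print(clean_keyword("helmet [iron]"))
```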



<h3 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/nodac/">nodac</a></h3>



<ul class="wp-block-list">
<li>Splitting of keywords now also recognizes keywords that should be split into places, times, etc.
<ul class="wp-block-list">
<li>Example: “helmet; Berlin” > keyword “helmet” + place “Berlin”</li>
</ul>
</li>



<li>When searching for keywords with ambiguous names, both keywords and generally ambiguous terms are now taken into account</li>



<li>Times can now be merged with others directly from the time edit page</li>
</ul>
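


<p>The keyword splitting described in the list above can be illustrated with a short Python sketch. How nodac actually recognizes places or times is not covered here; the sketch simply assumes a set of known place names is available, and <code>split_keyword</code> is a hypothetical function.</p>



```python
from typing import Dict, List, Set


def split_keyword(raw: str, known_places: Set[str]) -> Dict[str, List[str]]:
    """Split "helmet; Berlin" into keyword "helmet" and place "Berlin"."""
    result: Dict[str, List[str]] = {"keywords": [], "places": []}
    for part in raw.split(";"):
        term = part.strip()
        if not term:
            continue
        # Assumption: a term counts as a place if it is in the supplied set.
        target = "places" if term in known_places else "keywords"
        result[target].append(term)
    return result


print(split_keyword("helmet; Berlin", {"Berlin"}))
```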



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/02/typing-2025-01.avif</url><width>600</width><height>336</height></post-thumbnail>	</item>
		<item>
		<title>Version Control &#038; Batch Transfer Between Data Fields of Object Records</title>
		<link>https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/</link>
					<comments>https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 13 Jan 2025 15:05:59 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Batch editing]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Object editing (musdb)]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<category><![CDATA[Version control]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4269</guid>

					<description><![CDATA[The new year 2025 comes with two long-awaited new features in musdb: detailed version control of object data and an option to batch transfer object data from one free text field to another. Version control Until a few days ago, a central and sorely missed feature in musdb was a detailed version history of the <a href="https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>The new year 2025 comes with two long-awaited new features in musdb: detailed version control of object data and an option to batch transfer object data from one free text field to another.</p>



<h2 class="wp-block-heading">Version control</h2>



<p>Until a few days ago, a central and sorely missed feature in musdb was a detailed version history of the data records, for example to trace and restore data after a batch-processing attempt gone wrong or after careless errors when deleting field contents.</p>



<p>Such a view of all previous versions of an object record since the start of recording (May 2024) can now be accessed via the &#8220;record history&#8221; tab when viewing and editing an object in musdb. A new &#8220;Open versioning&#8221; button appears right at the top.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="1021" height="600" src="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_1_EN.avif" alt="musdb: Versioning via record history" class="wp-image-4267" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_1_EN.avif 1021w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_1_EN-300x176.jpg 300w" sizes="auto, (max-width: 1021px) 100vw, 1021px" /><figcaption class="wp-element-caption">The detailed version history can be accessed via a new button at the top of the “record history” tab when editing an object.
</figcaption></figure>



<p>Clicking on it opens an overlay in which the various versions are listed as a table. The various aspects of the object data set are divided into different tabs and therefore different tables, e.g. for basic information, administrative information, links to collections, keywords, etc.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="634" src="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_2_EN-1024x634.jpg" alt="musdb: Versioning overlay" class="wp-image-4268" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_2_EN-1024x634.jpg 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_2_EN-300x186.jpg 300w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_2_EN.avif 1308w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The version history of an object is presented in a table view in the overlay. Cells whose values have changed in one version relative to the previous one are displayed with dashed borders. Empty cells are dashed sideways. In the screenshot, the most recent version can be seen at the top (empty cell in the column &#8220;end&#8221;). Between lines 2 and 3, the object description was significantly shortened, leading to a reduction in the quality index in the most recent version (top row).</figcaption></figure>
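


<p>Highlighting the cells that changed between two versions amounts to comparing each version with the next older one. A minimal Python sketch of that comparison follows; it is not taken from musdb, <code>changed_fields</code> and the field names are hypothetical, and versions are assumed to be ordered with the most recent one first, as in the overlay.</p>



```python
from typing import Any, Dict, List, Set


def changed_fields(versions: List[Dict[str, Any]]) -> List[Set[str]]:
    """For each version (newest first), list the fields that differ from the next older one."""
    changes: List[Set[str]] = []
    for index, version in enumerate(versions):
        if index + 1 == len(versions):
            changes.append(set())  # oldest version: nothing to compare with
            continue
        older = versions[index + 1]
        all_keys = version.keys() | older.keys()
        changes.append({key for key in all_keys if version.get(key) != older.get(key)})
    return changes


# Invented sample history, newest version first.
history = [
    {"name": "Helmet", "description": "Short text"},
    {"name": "Helmet", "description": "A much longer description"},
]
print(changed_fields(history))
```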



<h2 class="wp-block-heading">Batch transfer</h2>



<p>A second frequently requested feature &#8211; especially after importing &#8211; has been the option to transfer the content of one data field of object records to another. If, for example, the information previously stored in the non-publishable data field &#8220;object history&#8221; is to be stored in the &#8220;detailed description&#8221; in the future and published there, the transfer from one field to the other can now be carried out for hundreds of objects with the pressing of a few buttons. Like all other &#8220;Global Change&#8221; options, the batch transfer between different data fields always refers to the results of an object search. The function is available via the sidebar of the object overview once a search for objects has been executed.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="487" src="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_1_EN-1024x487.jpg" alt="musdb: Batch transfer in object overview" class="wp-image-4265" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_1_EN-1024x487.jpg 1024w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_1_EN-300x143.jpg 300w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_1_EN.avif 1385w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">If a search filter is set for objects, various options for export and batch processing appear in the bottom right of the sidebar. A new option &#8220;batch transfer&#8221; can be found at the very bottom of the list.</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="996" height="780" src="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_2_EN.avif" alt="musdb: batch transfer overlay" class="wp-image-4266" srcset="https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_2_EN.avif 996w, https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Transfer_2_EN-300x235.jpg 300w" sizes="auto, (max-width: 996px) 100vw, 996px" /><figcaption class="wp-element-caption">Screenshot of the new option for transferring object data in batches from one free text field to another. In addition to the free text fields for the object, the two &#8220;special sources&#8221; &#8220;separate dimensions&#8221; and &#8220;separate information: material and technology&#8221; can be selected, as shown in the screenshot. When transferring between data fields, the content of the target field can be overwritten with the content of the source data field, or the latter can be prepended or appended to the target field&#8217;s content.</figcaption></figure>



<p>The batch transfer between fields based on search results can also be used via musdb&#8217;s API. For this, a new API route <code>/object/transfer_by_search/{mode}</code> has been added.</p>
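


<p>The three transfer behaviours described above (overwrite, prepend, append) can be sketched in a few lines of Python. This is only an illustration and not musdb&#8217;s actual implementation: the function name, the mode names, the separator and the handling of empty fields are all assumptions made for the example.</p>

```python
# Hypothetical mode names; the post only describes the three behaviours.
MODES = ("overwrite", "prepend", "append")


def transfer_field(source: str, target: str, mode: str, separator: str = "\n") -> str:
    """Return the new content of the target field after a transfer."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Overwriting, or writing into an empty target, simply takes the source.
    if mode == "overwrite" or not target:
        return source
    # Assumption: an empty source leaves an existing target untouched.
    if not source:
        return target
    if mode == "prepend":
        return source + separator + target
    return target + separator + source
```

<p>Applied to every object in a search result, a function of this kind would move e.g. the content of &#8220;object history&#8221; into &#8220;detailed description&#8221;.</p>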
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2025/01/13/version-control-batch-transfer-between-data-fields-of-object-records/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/01/20250113_musdb-Versioning_2_EN.avif</url><width>600</width><height>372</height></post-thumbnail>	</item>
		<item>
		<title>State of Development, November 2024: &#8220;Real&#8221; separated Measurements and a Better Recognition of Tags, Places, etc.</title>
		<link>https://blog.museum-digital.org/2025/01/13/state-of-development-november-2024-real-separated-measurements-and-a-better-recognition-of-tags-places-etc/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 13 Jan 2025 13:41:45 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Change log]]></category>
		<category><![CDATA[New Features]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4255</guid>

					<description><![CDATA[A short overview in list form of the recent technical updates around museum-digital, as of November 2024.]]></description>
										<content:encoded><![CDATA[
<p>musdb</p>



<ul class="wp-block-list">
<li>Basic re-implementation of separated measurements for objects (tab: &#8220;addendum&#8221;)</li>



<li>The type of measurement (width, length, etc.) is now managed using a controlled list, which can easily be extended. This also allows for measurement types of different levels of specificity (&#8220;width&#8221; vs. &#8220;width of socle&#8221;)</li>



<li>The new implementation allows users to identify whether a measurement is exact and add notes for each measurement</li>



<li>Measured values now need to be entered as a floating point number &#8211; consistent search for objects smaller or larger than a given size is thus made possible (previously only available for entries where the system could deduce a numeric value from whatever had been entered)</li>



<li>Unique naming components of entered tags are automatically parsed into a relation type. German: &#8220;Apfel (Motiv)&#8221; is automatically split into the tag &#8220;Apple&#8221; and the relation type &#8220;display subject&#8221;.</li>



<li>Users entering a tag with such a naming component that is known to belong to another vocabulary will have their input auto-corrected to reflect both. Entering the tag &#8220;Berlin (Motiv)&#8221; will be auto-corrected to a link to a &#8220;displayed place&#8221; &#8220;Berlin&#8221;.</li>



<li>The list of spaces that can be linked to an object as its current location is now sorted alphabetically</li>
</ul>
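


<p>Why numeric values matter for searching can be shown with a small sketch. The data structure below is purely illustrative: the field names are assumptions and do not reflect musdb&#8217;s database schema, and the comparison assumes that all values of one measurement type share the same unit.</p>

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    object_id: int
    measurement_type: str  # entry from the controlled list, e.g. "width"
    value: float           # always numeric, so it can be compared
    unit: str
    exact: bool = True     # whether the measurement is exact
    note: str = ""         # free-text note on the measurement


def objects_at_least(measurements, measurement_type, minimum):
    """IDs of objects with a measurement of the given type at or above minimum."""
    return sorted({m.object_id for m in measurements
                   if m.measurement_type == measurement_type and m.value >= minimum})
```

<p>With free-text values such as &#8220;ca. 12 cm&#8221;, a comparison like this would only work for entries where a number could be guessed from the text.</p>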



<p>Frontend</p>



<ul class="wp-block-list">
<li>Re-implemented separated Measurements</li>
</ul>



<p>Import</p>



<ul class="wp-block-list">
<li>The import tool now also uses the new implementation of separated measurements</li>



<li>Unique naming components of entered tags are automatically parsed into a relation type. German: &#8220;Apfel (Motiv)&#8221; is automatically split into the tag &#8220;Apple&#8221; and the relation type &#8220;display subject&#8221;.</li>



<li>Users entering a tag with such a naming component that is known to belong to another vocabulary will have their input auto-corrected to reflect both. Entering the tag &#8220;Berlin (Motiv)&#8221; will be auto-corrected to a link to a &#8220;displayed place&#8221; &#8220;Berlin&#8221;.</li>



<li>Links between two objects are now imported using the dedicated data type for this purpose. They had previously been imported as regular web links.</li>
</ul>
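


<p>The recognition of naming components in tags can be pictured roughly as follows. The sketch is an assumption-laden illustration, not the code used in musdb or the importer: the suffix table contains only the one German example from this post, and the set of known places stands in for a lookup in the place vocabulary.</p>

```python
import re

# Hypothetical mapping; the post only mentions the German suffix "(Motiv)".
SUFFIX_RELATIONS = {"Motiv": "display subject"}
# Stand-in for a lookup of names known to belong to the place vocabulary.
KNOWN_PLACES = {"Berlin"}

# A name followed by a single bracketed component at the very end.
SUFFIX_PATTERN = re.compile(r"^(.+?)\s*\(([^()]+)\)$")


def parse_tag(raw):
    """Return (vocabulary, name, relation type) for an entered tag."""
    cleaned = raw.strip()
    match = SUFFIX_PATTERN.match(cleaned)
    if match is None or match.group(2) not in SUFFIX_RELATIONS:
        return ("tag", cleaned, "")
    name = match.group(1)
    if name in KNOWN_PLACES:
        # The name belongs to another vocabulary: link it as a displayed place.
        return ("place", name, "displayed place")
    return ("tag", name, SUFFIX_RELATIONS[match.group(2)])
```

<p>Under these assumptions, <code>parse_tag("Apfel (Motiv)")</code> yields a tag with the relation type &#8220;display subject&#8221;, while <code>parse_tag("Berlin (Motiv)")</code> yields a link to the place vocabulary.</p>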



<p>nodac</p>



<ul class="wp-block-list">
<li>When merging two entries, links between the entries and collections, exhibitions, etc. are now reflected and rewritten</li>



<li>previously, links to such data types prevented the completion of the merge</li>



<li>If an entry&#8217;s name is marked to always belong to e.g. a tag, the buttons for transferring the entry to the actor, place, or time vocabularies are now hidden.</li>



<li>If an entry is moved between vocabularies, links between the entry and objects as &#8220;display subject&#8221; / &#8220;displayed place&#8221; / &#8220;displayed actor&#8221; are reflected and translated into new links using the appropriate link type.</li>



<li>A new context menu on overview pages allows for a quick access to functionalities like the splitting of entries</li>
</ul>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2025/01/a-1.avif</url><width>600</width><height>338</height></post-thumbnail>	</item>
		<item>
		<title>State of Development, October 2024: Searching Objects Currently On Exhibition, Linking Location and Acquisition of Literature</title>
		<link>https://blog.museum-digital.org/2024/11/06/state-of-development-october-2024-searching-objects-currently-on-exhibition-linking-location-and-acquisition-of-literature/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Wed, 06 Nov 2024 12:58:01 +0000</pubDate>
				<category><![CDATA[Community]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Frontend]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Change log]]></category>
		<category><![CDATA[Object images]]></category>
		<category><![CDATA[Object search (musdb)]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4194</guid>

					<description><![CDATA[After the blog has been very quiet this year with regard to the technical development of museum-digital, we are now trying to publish the summaries of new developments &#8211; enriched with screenshots &#8211; that are prepared for the monthly “regional administrators” rounds in Germany anyway. These are in the form of listings, and this is <a href="https://blog.museum-digital.org/2024/11/06/state-of-development-october-2024-searching-objects-currently-on-exhibition-linking-location-and-acquisition-of-literature/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>After the blog has been very quiet this year with regard to the technical development of museum-digital, we are now trying to publish the summaries of new developments &#8211; enriched with screenshots &#8211; that are prepared for the monthly “regional administrators” rounds in Germany anyway. </p>



<p>These are in the form of listings, and this is how it should be here too.</p>



<h2 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/frontend/">Frontend</a></h2>



<h3 class="wp-block-heading">Features &amp; Improvements</h3>



<ul class="wp-block-list">
<li>Some improvements in background scripts, especially better handling of timeouts when calculating “Similar objects” in very large instances</li>



<li>Contributors, linked locations and times for an object group are now listed alphabetically by name</li>



<li>Table headers for event components (who, when, where) are now only displayed in the A4 PDF if there is also content for the row</li>



<li>New search option for object searches: “Is currently on display”</li>



<li>Links to the Themator now use the new URL scheme of the Themator<br>(<a href="https://themator.museum-digital.de/t/690">https://themator.museum-digital.de/t/690</a> instead of <a href="https://themator.museum-digital.de/ausgabe/showthema.php?m_tid=690&amp;tid=690">https://themator.museum-digital.de/ausgabe/showthema.php?m_tid=690&amp;tid=690</a>)</li>
</ul>



<figure class="wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex">
<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="826" height="459" data-id="4185" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_Suche_verfeinern.png.avif" alt="Screenshot aus dem Frontend von museum-digital." class="wp-image-4185" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_Suche_verfeinern.png.avif 826w, https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_Suche_verfeinern.png-300x167.avif 300w" sizes="auto, (max-width: 826px) 100vw, 826px" /><figcaption class="wp-element-caption">The new filter option “Currently on display” in the overlay for the advanced search for objects in the frontend of museum-digital.</figcaption></figure>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="548" height="583" data-id="4184" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_MItwirkende_sortiert.png.avif" alt="Screenshot aus dem Frontend von museum-digital." class="wp-image-4184" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_MItwirkende_sortiert.png.avif 548w, https://blog.museum-digital.org/wp-content/uploads/2024/11/frontend_MItwirkende_sortiert.png-282x300.avif 282w" sizes="auto, (max-width: 548px) 100vw, 548px" /><figcaption class="wp-element-caption">The contributors to an object group are now sorted alphabetically by name.</figcaption></figure>
</figure>



<h3 class="wp-block-heading">Bugfixes</h3>



<ul class="wp-block-list">
<li>Error when searching for controlled list terms that contained multiple spaces via the “Refine search” overlay (search for license “Public Domain Mark”)</li>



<li>Exactness setting in the “refine search” overlay was not transferred to the actual search query</li>



<li>Simple embedding of an object (analogous to YouTube videos, for example; accessible via the “Cite” menu of an object page) had various errors / now works again</li>
</ul>



<h2 class="wp-block-heading"><a href="https://de.about.museum-digital.org/software/musdb/">musdb</a></h2>



<h3 class="wp-block-heading">Features &amp; Improvements</h3>



<ul class="wp-block-list">
<li>In the API documentation of musdb there is now a note that the frontend also has an API
<ul class="wp-block-list">
<li>Frontend API
<ul class="wp-block-list">
<li>You do not need to authenticate yourself to use the frontend API</li>



<li>The frontend API tends to be faster and easier to use</li>



<li>Is read-only</li>
</ul>
</li>



<li>musdb API
<ul class="wp-block-list">
<li>Can do more: Can also see non-public stocks and fields / data types</li>



<li>Is much more granular (more queries for the same data, but you likely get exactly the data you are looking for instead of e.g. all data known about a given object)</li>



<li>Can be used for writing data</li>
</ul>
</li>
</ul>
</li>



<li>Suggestion lists when searching for vocabulary terms in the side column of the object search page have been revised
<ul class="wp-block-list">
<li>Tooltips appear when hovering over</li>



<li>Implementation in Vanilla JS, removing jQuery</li>



<li>(this means significantly better performance of the search results list in list format, because jQuery no longer needs to be loaded)</li>
</ul>
</li>
</ul>






<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="416" height="1024" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Tooltip_in_Auswahlliste.png-416x1024.avif" alt="Screenshot aus musdb." class="wp-image-4188" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Tooltip_in_Auswahlliste.png-416x1024.avif 416w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Tooltip_in_Auswahlliste.png-122x300.avif 122w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Tooltip_in_Auswahlliste.png.avif 714w" sizes="auto, (max-width: 416px) 100vw, 416px" /><figcaption class="wp-element-caption">The suggestion lists for places, times, persons and keywords in the quick search function of the object search mask have been re-implemented. The main visible benefit is that explanations now appear directly when hovering over the terms in the list.</figcaption></figure>



<ul class="wp-block-list">
<li>User page / Login
<ul class="wp-block-list">
<li>Log of logins now also with IP and user agents</li>



<li>Login via login persisted in the browser (“Remember me”) is logged and displayed</li>



<li>All browsers permanently logged in via cookie are forced to log in again after a password change</li>



<li>New option to invalidate all remembered logins on other devices (browser must be logged in again)</li>
</ul>
</li>
</ul>
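


<p>How invalidating remembered logins works in principle can be sketched with a small in-memory token store. Everything here is hypothetical (class name, method names, storage), and a real system would persist hashed tokens in a database rather than keep plain tokens in memory.</p>

```python
import secrets


class RememberedLogins:
    """In-memory stand-in for a store of "Remember me" tokens per user."""

    def __init__(self):
        self._tokens = {}  # token -> user name

    def remember(self, user):
        """Issue a new token that a browser keeps in its login cookie."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = user
        return token

    def user_for(self, token):
        """Return the user a token belongs to, or None if it is not valid."""
        return self._tokens.get(token)

    def invalidate_all(self, user):
        """Drop every remembered login of a user, e.g. after a password change."""
        self._tokens = {t: u for t, u in self._tokens.items() if u != user}
```

<p>Calling <code>invalidate_all()</code> both after a password change and from a dedicated button covers the two cases listed above: every browser that still holds an old token has to log in again.</p>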



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="694" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Login_log.png-1024x694.avif" alt="Screenshot aus musdb." class="wp-image-4186" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Login_log.png-1024x694.avif 1024w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Login_log.png-300x203.avif 300w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Login_log.png-1536x1041.avif 1536w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Login_log.png.avif 1762w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">The “Login log” in the account settings can be used to track when and in what context one&#8217;s own user account was accessed in musdb. This allows for the identification of account takeovers by third parties. Newly logged and/or displayed are: IP address used to log in, the user agent (identification of the browser) and whether the browser was automatically logged in via a permanent login cookie (“Remember me”).</figcaption></figure>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="942" height="678" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_User_Erinnerte_Logins_loeschen.png.avif" alt="Screenshot aus musdb." class="wp-image-4189" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_User_Erinnerte_Logins_loeschen.png.avif 942w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_User_Erinnerte_Logins_loeschen.png-300x216.avif 300w" sizes="auto, (max-width: 942px) 100vw, 942px" /><figcaption class="wp-element-caption">A new button in the toolbar of the account settings in musdb allows you to log out all permanently logged in browsers / devices from your own account.</figcaption></figure>



<ul class="wp-block-list">
<li>Object
<ul class="wp-block-list">
<li>More restrictions for the publication of object data records.</li>



<li>An object can no longer be published if:
<ul class="wp-block-list">
<li>&#8230; the object name is the same as the object description</li>



<li>&#8230; the description contains the character string “lorem ipsum”</li>
</ul>
</li>



<li>When object entries are unpublished / hidden, the images linked to the object are renamed (thus invalidating links to the images). When the object is published again, this is reversed so that existing links work again.</li>



<li>When linking a space as an object&#8217;s actual location, the spaces in the selection list are now sorted alphabetically</li>
</ul>
</li>



<li>Literature
<ul class="wp-block-list">
<li>Acquisitions can now be linked to literature
<ul class="wp-block-list">
<li>Previous owners etc. can thus be linked to a literature entry</li>
</ul>
</li>



<li>Spaces (actual location) can be linked to literature</li>
</ul>
</li>
</ul>
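


<p>The two new publication restrictions for objects can be expressed as a simple check. The following is a minimal sketch under stated assumptions: the function name is made up, and ignoring case and surrounding whitespace is a guess rather than documented behaviour of musdb.</p>

```python
def publication_blockers(name: str, description: str) -> list:
    """Return the reasons why an object record may not be published."""
    blockers = []
    # Rule 1: the object name must not simply be repeated as the description.
    if name.strip().casefold() == description.strip().casefold():
        blockers.append("name equals description")
    # Rule 2: placeholder text must not end up in a published record.
    if "lorem ipsum" in description.casefold():
        blockers.append("description contains placeholder text")
    return blockers
```

<p>An empty list means that neither of the two rules prevents publication. Other existing requirements for publishing an object are not covered by this sketch.</p>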



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="504" src="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Reiter_Verwaltung.png-1024x504.avif" alt="Screenshot aus musdb." class="wp-image-4187" srcset="https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Reiter_Verwaltung.png-1024x504.avif 1024w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Reiter_Verwaltung.png-300x148.avif 300w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Reiter_Verwaltung.png-1536x756.avif 1536w, https://blog.museum-digital.org/wp-content/uploads/2024/11/musdb_Reiter_Verwaltung.png.avif 1800w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Via the new “Administration” tab on the literature editing page, the location and acquisition context of the literature entry can be linked. This can be useful if the literature module is also used to manage the museum library.</figcaption></figure>



<h3 class="wp-block-heading">Bugfixes</h3>



<ul class="wp-block-list">
<li>Overlay for setting searches for objects: Multi-word search terms were converted into multiple searches instead of being searched as a string of words (“red helmet” &gt; “red” AND “helmet” instead of “red helmet”)</li>



<li>Error when searching for controlled list terms that contained multiple spaces via the “Refine search” overlay (search for license “Public Domain Mark”)</li>
</ul>
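


<p>The difference between the faulty and the corrected handling of multi-word search terms can be illustrated with two tiny matching functions. They are simplified stand-ins operating on plain strings. The actual search in musdb runs against the database and is not shown in this post.</p>

```python
def matches_all_words(text, query):
    """Faulty behaviour: every word may occur anywhere in the text."""
    haystack = text.casefold()
    return all(word in haystack for word in query.casefold().split())


def matches_phrase(text, query):
    """Corrected behaviour: the words must occur together as one string."""
    return query.casefold() in text.casefold()
```

<p>A record described as &#8220;Helmet with red plume&#8221; is found by the first function when searching for &#8220;red helmet&#8221;, but not by the second one.</p>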



<h2 class="wp-block-heading"><a href="https://blog.museum-digital.org/de/category/technik-design/importer-de/">Importer</a></h2>



<ul class="wp-block-list">
<li>Link between literature and spaces (actual location) as well as acquisitions is implemented in the &#8220;core&#8221; of the import tool</li>



<li>ImageByInvno parser (assignment of images to objects via inventory numbers contained in the file name) can now be used to import PDF files</li>
</ul>



<h2 class="wp-block-heading"><a href="https://files.museum-digital.org/">files.museum-digital.org</a></h2>



<ul class="wp-block-list">
<li>Added a small script to enhance PDF metadata based on an XML sidecar file. See e.g.: <a href="https://files.museum-digital.org/de/Praesentationen/2024-10-18_md-deutschland-eV-stellt-sich-vor_Usertreffen_MA.xml">https://files.museum-digital.org/de/Praesentationen/2024-10-18_md-deutschland-eV-stellt-sich-vor_Usertreffen_MA.xml</a></li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p>Main post image generated using illustriousXL_smoothftSPO</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<div class="wp-block-cgb-cc-by message-body" style="background-color:white;color:black"><img loading="lazy" decoding="async" src="https://blog.museum-digital.org/wp-content/plugins/creative-commons/includes/images/by.png" alt="CC" width="88" height="31"/><p><span class="cc-cgb-name">This content</span> is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International license.</a> <span class="cc-cgb-text"></span></p></div>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2024/11/banner.png.avif</url><width>600</width><height>336</height></post-thumbnail>	</item>
		<item>
		<title>Who is actually using musdb? And what for?</title>
		<link>https://blog.museum-digital.org/2024/06/16/who-is-actually-using-musdb-and-what-for/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Sun, 16 Jun 2024 15:28:18 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Statistics]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=4143</guid>

					<description><![CDATA[In its most recently published survey of museums in Germany the Institute for Museum Research (Berlin) asked how many museums use controlled vocabularies and norm data. 416 of the 3059 museums who answered the additional question sheet with this particular question answered that they do indeed use norm data. The survey concerns German museums as <a href="https://blog.museum-digital.org/2024/06/16/who-is-actually-using-musdb-and-what-for/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>In its most <a href="https://doi.org/10.11588/ifmzm.2023.1">recently published survey of museums in Germany</a> the Institute for Museum Research (Berlin) asked how many museums use controlled vocabularies and norm data. 416 of the 3059 museums who answered the additional question sheet with this particular question answered that they do indeed use norm data. The survey concerns German museums as of 2021.</p>



<p>On January 1st, 2021, 866 museums and similar institutions were registered with museum-digital in Germany. 688 of these were publicly listed, which means that they had recorded and published at least one object entry. The recording of object metadata in musdb is barely possible without using controlled vocabularies (in turn linked to the large norm data catalogues like the Gemeinsame Normdatei, the Library of Congress Subject Headings or Geonames), unless one really only fills out the most basic description fields. Which is to say that 688 museums in Germany were almost certainly using norm data in 2021. An important assumption in the design of musdb is that norm data adoption is most successful if users benefit from it &#8211; ideally &#8211; without even knowing that they use it.</p>



<p><em>Maybe it is exactly for that reason, but something clearly is wrong about our numbers.</em></p>



<p>Aside from questions on a mainly statistically and politically interesting global / national level, questions about the participating institutions pop up regularly in more practically relevant situations. Museums considering musdb as a collection management solution ask how many other museums use musdb for collection management. As it is both a collection management system and the gateway to the publication of museum data via museum-digital, we cannot yet answer those questions safely beyond the better known case studies.</p>



<p>At least for its background, museum-digital in Germany is known to be primarily used by smaller museums. Knowing the German museum landscape, this is pretty much true in terms of collection management, and less so when it comes to the publication of museum collection data. But there is no data to give a more detailed assessment. Given the approach of a centralized vocabulary management with centrally determined names for actors, places, times and tags, it would be reasonable to assume that art history museums are less likely to use museum-digital than other types of museums. As of now, we cannot confirm or deny that.</p>



<p><em>Getting better data would be very valuable for linking like-minded museums and gaining a better understanding of who is actually using museum-digital and/or musdb.</em></p>



<p>At museum-digital, we know details about the participating museums, but there is a lack of an overview. And details can obscure an overarching perspective (if a museum has only published its numismatic collection via museum-digital but is actually collecting all types of objects, it would clearly be wrong to deduce that it is a numismatic collection based on its published objects).</p>



<p><em>Overarching statistical information about institutions can hence not be safely deduced from the published collections. It is information that the colleagues at these institutions know best themselves.</em></p>



<h2 class="wp-block-heading">A Small Survey</h2>



<p>Given this background, it is time that we try to learn some more about the institutions using musdb. And that clearly will work best by asking people themselves.</p>



<p>Starting today, users logging in to musdb will hence be asked to fill out a very quick survey of six questions:</p>



<ul class="wp-block-list">
<li><em>Please select the type of your institution.</em> (museum, archive, &#8230;)</li>



<li><em>Is your institution a privately-run or a public institution?</em></li>



<li><em>What does your institution use museum-digital for?</em> (Collection management, publication, both)</li>



<li><em>How many objects comprise the museum&#8217;s collections (estimate)?</em></li>



<li><em>How large is your exhibition space (sqm)?</em></li>



<li><em>Please select the main topical foci of your collection.</em></li>
</ul>



<p>The dialogue asking users to fill out this survey will only appear upon logins until the survey has been filled out for the museum.</p>



<h2 class="wp-block-heading">What will the data be used for?</h2>



<p>For the start, the data will only be used for answering questions such as the ones stated above. Knowing the intended use of musdb and the main topical foci of a museum however also opens up new possibilities in making musdb easier to use, especially for new users. This information may e.g. be used to set more defaults for new users that are tailored towards the intended use case for musdb instead of providing only a single set of defaults for a given functionality.</p>
]]></content:encoded>
					
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2024/06/20240616_Screenshot-musdb-Self-categorization-scaled.avif</url><width>600</width><height>338</height></post-thumbnail>	</item>
		<item>
		<title>Automatically enforcing consistent naming of places</title>
		<link>https://blog.museum-digital.org/2023/11/27/automatically-enforcing-consistent-naming-of-places/</link>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 27 Nov 2023 14:10:12 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Importer]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[Autocorrection]]></category>
		<category><![CDATA[Controlled Vocabularies]]></category>
		<category><![CDATA[New Features]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=3978</guid>

					<description><![CDATA[Last week I wrote about how new actors find their way into museum-digital&#8217;s controlled vocabulary for actors during imports. One of the first steps detailed in the post is the automatic cleanup of the actor&#8217;s name and the application of some rules to ensure a consistent naming of actors. For time names a much more <a href="https://blog.museum-digital.org/2023/11/27/automatically-enforcing-consistent-naming-of-places/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>Last week I wrote about how new actors find <a href="https://blog.museum-digital.org/2023/11/22/importing-actors/">their way into museum-digital&#8217;s controlled vocabulary for actors during imports</a>. One of the first steps detailed in the post is the automatic cleanup of the actor&#8217;s name and the application of some rules to ensure a consistent naming of actors.</p>



<p>For time names a much more extensive cleaning is done, both in the case of imports and when working directly in musdb. Time names in the sense of the controlled vocabulary of museum-digital describe clearly defined times. This may be timespans, years, or full dates. But the number of possible input values is thus already significantly limited. In fact, it is small enough, and timespans, years, etc. are uniform enough to allow the automatic parsing of time names. Where that is possible, they are automatically rewritten to their default form in the controlled vocabulary and automatically translated to some 30 languages.</p>



<p>In the case of place names, a similar cleaning and consolidating was until now limited to the simple stripping of leading and trailing spaces and commas and the removal of indicators for uncertainty (e.g. trailing question marks). Since the weekend, place names are more extensively rewritten to ease the work of the vocabulary editors and ensure a more immediately consistent data entry both in musdb and when importing.</p>



<h2 class="wp-block-heading">General rewriting</h2>



<h3 class="wp-block-heading">1. Simple spelling issues</h3>



<p>The very first step in rewriting entered place names remains the removal of superfluous characters. As such, duplicate white spaces, trailing commas, etc. are removed.</p>



<p><strong>Example</strong>: <code>, Berlin ,</code> &gt; <code>Berlin</code></p>
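


<p>A minimal sketch of this first step in Python could look as follows. The function name is made up for illustration and the actual rules used in musdb and the importer may cover more cases.</p>

```python
import re


def strip_superfluous(name: str) -> str:
    """Collapse repeated white space and drop leading / trailing commas."""
    collapsed = re.sub(r"\s+", " ", name)
    # Remove any mix of spaces and commas at both ends of the name.
    return collapsed.strip(" ,")
```

<p>Commas inside the name are kept, as they may carry meaning (see step 3).</p>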



<h3 class="wp-block-heading">2. Removal of indicators for uncertainty</h3>



<p>Next, known indicators for uncertainty are used to update the dedicated flag describing the certainty of the link to the place and then stripped. As they indicate the relation between object and place, they are not actually part of the place name and can thus be removed safely.</p>



<p><strong>Example 1</strong>: <code>Berlin ?</code> &gt; <code>Berlin</code><br><strong>Example 2</strong>: <code>Maybe Budapest</code> &gt; <code>Budapest</code></p>
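


<p>The following sketch shows the idea of separating the uncertainty from the name. The list of indicators is an assumption and only covers the two examples given here. The real list is certainly longer and language-dependent.</p>

```python
# Hypothetical list of leading indicators; only "Maybe" appears in the post.
UNCERTAINTY_PREFIXES = ("maybe ",)


def split_uncertainty(name):
    """Return (cleaned place name, whether the link to the place is certain)."""
    cleaned = name.strip()
    certain = True
    if cleaned.endswith("?"):
        # Trailing question marks describe the link, not the place itself.
        cleaned = cleaned.rstrip("? ")
        certain = False
    for prefix in UNCERTAINTY_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            certain = False
    return cleaned, certain
```

<p>The second return value would be used to set the certainty flag of the link between object and place, so that no information is lost when the indicator is stripped from the name.</p>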



<h3 class="wp-block-heading">3. Removal of duplicates in enumerations</h3>



<p>Commas are used in input place names in two ways. On the one hand, they may be used to further specify a place name (<code>Beijing, PRC</code>); on the other, they are often used in import data to designate multiple place names at a time (<code>Beijing, Tokyo, Nairobi</code>; this contradicts the logic of a database altogether and needs to be cleaned up manually by the vocabulary editors).</p>



<p>In both cases, duplicate names in the enumeration are superfluous and can be removed.</p>



<p><strong>Example 1</strong>: <code>Berlin, Germany, Germany</code> &gt; <code>Berlin, Germany</code></p>
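
<p>The removal of duplicates could be sketched like this in Python. The function name is hypothetical, and the sketch assumes that only exact repetitions count as duplicates.</p>

```python
def drop_duplicate_parts(name):
    """Remove repeated entries from a comma-separated name, keeping the order."""
    seen = set()
    kept = []
    for part in name.split(","):
        part = part.strip()
        # Keep a part only the first time it appears; skip empty parts.
        if part and part not in seen:
            seen.add(part)
            kept.append(part)
    return ", ".join(kept)


print(drop_duplicate_parts("Berlin, Germany, Germany"))  # prints: Berlin, Germany
```

<p>Because the order of the remaining parts is preserved, the result is the same whether the commas specify a single place or enumerate several.</p>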



<h3 class="wp-block-heading">4. Language-dependent rewriting</h3>



<p>The next steps in rewriting entries depend on the language of the entry. In musdb, the language the user has set for the interface is used to guess the language they are entering data in. In the case of imports, the default language of the instance of museum-digital that the museum imports to is used.</p>



<h4 class="wp-block-heading">4.1. Extension of common abbreviations</h4>



<p>Where there are abbreviations, there are also unabbreviated names. And surely, both will be used. This inevitably leads to duplicate entries. Some common abbreviations are thus automatically rewritten to a canonical form.</p>



<p><strong>Example (German)</strong>: <code>Adalbertstr. 12 (Berlin)</code> &gt; <code>Adalbertstraße 12 (Berlin)</code><br><strong>Example (Hungarian)</strong>: <code>Vaci u. 12 (Budapest)</code> &gt; <code>Vaci utca 12 (Budapest)</code></p>
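
<p>One possible way to express such language-dependent rules is a small table of patterns per language, as in the Python sketch below. The table contains only the two abbreviations from the examples; the names <code>ABBREVIATIONS</code> and <code>expand_abbreviations</code> are invented here, and the real rule set is presumably larger.</p>

```python
import re

# Hypothetical rules per language code: (pattern, replacement).
ABBREVIATIONS = {
    # "str." at the end of a word, e.g. "Adalbertstr."
    "de": [(re.compile(r"str\.(?=\s|$)"), "straße")],
    # "u." as a separate word, e.g. "Vaci u. 12"
    "hu": [(re.compile(r"\bu\.(?=\s|$)"), "utca")],
}


def expand_abbreviations(name, lang):
    """Rewrite known abbreviations for the given language to their full form."""
    for pattern, replacement in ABBREVIATIONS.get(lang, []):
        name = pattern.sub(replacement, name)
    return name


print(expand_abbreviations("Adalbertstr. 12 (Berlin)", "de"))
# prints: Adalbertstraße 12 (Berlin)
print(expand_abbreviations("Vaci u. 12 (Budapest)", "hu"))
# prints: Vaci utca 12 (Budapest)
```

<p>Keying the rules by language ensures that, for example, a Hungarian <code>u.</code> is not expanded in a German entry.</p>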



<h4 class="wp-block-heading">4.2. Reordering names in commas based on indicators for more specific place names</h4>



<p>As stated above (3.), commas may either indicate a specification of a single place, or they may indicate that the entry actually refers to more than one place. Some components of a name can be used to determine almost certainly that the former is the case &#8211; and which place in the given list is the specific one and which is a superordinate place named mainly for clarification. Common such name components are &#8220;street&#8221;, &#8220;plaza&#8221;, &#8220;pier&#8221;.</p>



<p>If there is exactly one comma and such a name component is encountered, the entered place name can be rewritten to contain the less specific name only in brackets.</p>



<p><strong>Example (German)</strong>: <code>Berlin, Adalbertstraße 12</code> &gt; <code>Adalbertstraße 12 (Berlin)</code><br><strong>Example (Hungarian)</strong>: <code>Vaci utca 12, Budapest</code> &gt; <code>Vaci utca 12 (Budapest)</code></p>



<p>If both names contain such an indicator, no rewriting is applied. <code>Vaci utca 12, Vaci utca 13</code> is thus not rewritten.</p>
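
<p>The decision logic described in this section can be sketched in Python as follows. The indicator lists are short, hypothetical excerpts and the simple substring check is an assumption; the actual matching may be stricter.</p>

```python
# Hypothetical indicator words per language code.
SPECIFIC_INDICATORS = {
    "de": ("straße", "platz"),
    "hu": ("utca",),
    "en": ("street", "plaza", "pier"),
}


def is_specific(part, lang):
    """Check whether a name part contains an indicator for a specific place."""
    lowered = part.lower()
    return any(word in lowered for word in SPECIFIC_INDICATORS.get(lang, ()))


def reorder_by_indicator(name, lang):
    """Rewrite "A, B" to "specific (unspecific)" if exactly one part is specific."""
    parts = [p.strip() for p in name.split(",")]
    if len(parts) != 2:
        return name  # only names with exactly one comma are handled
    first, second = parts
    first_specific = is_specific(first, lang)
    second_specific = is_specific(second, lang)
    if first_specific == second_specific:
        return name  # neither or both are specific: leave untouched
    if first_specific:
        return f"{first} ({second})"
    return f"{second} ({first})"


print(reorder_by_indicator("Berlin, Adalbertstraße 12", "de"))
# prints: Adalbertstraße 12 (Berlin)
print(reorder_by_indicator("Vaci utca 12, Vaci utca 13", "hu"))
# prints: Vaci utca 12, Vaci utca 13
```

<p>Leaving the name untouched whenever the case is ambiguous keeps the rewrite conservative: entries that may list several places still reach the vocabulary editors unchanged.</p>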



<h4 class="wp-block-heading">4.3. Budapest special: Extending the names of districts</h4>



<p>Street names in Budapest are usually given together with the name of the district. These districts are referred to in a number of ways. If they are referred to using Roman numerals, this is automatically expanded to the canonical form.<br>This rewrite is only applied if the language is set to Hungarian.</p>



<p><strong>Example</strong>: <code>Petőfi Sándor utca 3. Budapest, IV.</code> &gt; <code>Petőfi Sándor utca 3. (Budapest, 4. kerület)</code></p>
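
<p>A sketch of this rewrite in Python could look like the following. It assumes that the district is given at the very end of the name, directly after <code>Budapest</code>, as in the example above; the pattern and the function name are illustrative only. The lookup table covers the 23 districts of Budapest.</p>

```python
import re

# Roman numerals for the districts I to XXIII, mapped to their numbers.
ROMAN_DISTRICTS = [
    "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X",
    "XI", "XII", "XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX", "XX",
    "XXI", "XXII", "XXIII",
]
ROMAN_TO_NUMBER = {numeral: i for i, numeral in enumerate(ROMAN_DISTRICTS, start=1)}

# Street part, then "Budapest", then a Roman numeral with an optional dot.
DISTRICT_PATTERN = re.compile(r"^(.+?)[\s,]*Budapest,?\s+([IVX]+)\.?$")


def expand_budapest_district(name, lang):
    """Rewrite "street Budapest, IV." to "street (Budapest, 4. kerület)"."""
    if lang != "hu":
        return name  # the rule only applies to Hungarian entries
    match = DISTRICT_PATTERN.match(name)
    if match is None:
        return name
    street, numeral = match.groups()
    number = ROMAN_TO_NUMBER.get(numeral)
    if number is None:
        return name  # not a valid district numeral
    return f"{street} (Budapest, {number}. kerület)"


print(expand_budapest_district("Petőfi Sándor utca 3. Budapest, IV.", "hu"))
# prints: Petőfi Sándor utca 3. (Budapest, 4. kerület)
```

<p>A fixed lookup table is used instead of a general Roman numeral parser, since only a small, known set of values can occur.</p>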



<h4 class="wp-block-heading">4.4. Reordering names in commas based on country names</h4>



<p>Similar to the rewrites described in 4.2., country names can be used to indicate a hierarchical relationship between two places in a comma-separated list. If one given name is a country name and the other is not, it is likely that the non-country name is part of the given country. The comma can then be replaced with brackets and the name reordered into the preferred <code>specific (unspecific)</code> form.<br>This check is also applied to names separated by hyphens.</p>



<p><strong>Example (German)</strong>: <code>Budapest, Ungarn</code> &gt; <code>Budapest (Ungarn)</code><br><strong>Example (Hungarian)</strong>: <code>Berlin-Németország</code> &gt; <code>Berlin (Németország)</code></p>



<p>There are, however, some common cases in which this logic does not apply &#8211; most notably cardinal directions. If one such term is found to be the non-country part, the rewrite is not applied. As such, <code>West-Deutschland</code> remains <code>West-Deutschland</code> without being rewritten.</p>



<p>The list of names of countries and historical countries used stems from Wikidata (thanks!).</p>
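
<p>To illustrate the country-based reordering including the exception for cardinal directions, here is a minimal Python sketch. The two small sets stand in for the Wikidata-based country list and for the list of cardinal directions; their contents and all names in the code are assumptions made for this example.</p>

```python
# Tiny hypothetical excerpts per language code.
COUNTRIES = {
    "de": {"Ungarn", "Deutschland"},
    "hu": {"Németország", "Magyarország"},
}
CARDINAL_DIRECTIONS = {
    "de": {"Nord", "Süd", "Ost", "West"},
    "hu": {"Észak", "Dél", "Kelet", "Nyugat"},
}


def reorder_by_country(name, lang):
    """Rewrite "place, country" or "place-country" to "place (country)"."""
    countries = COUNTRIES.get(lang, set())
    directions = CARDINAL_DIRECTIONS.get(lang, set())
    for separator in (",", "-"):
        parts = [p.strip() for p in name.split(separator)]
        if len(parts) != 2:
            continue
        first, second = parts
        if (first in countries) == (second in countries):
            continue  # neither or both are country names
        if first in countries:
            place, country = second, first
        else:
            place, country = first, second
        if not place or place in directions:
            continue  # e.g. "West-Deutschland" stays as it is
        return f"{place} ({country})"
    return name


print(reorder_by_country("Budapest, Ungarn", "de"))    # prints: Budapest (Ungarn)
print(reorder_by_country("Berlin-Németország", "hu"))  # prints: Berlin (Németország)
print(reorder_by_country("West-Deutschland", "de"))    # prints: West-Deutschland
```

<p>As in 4.2., the sketch falls back to returning the input unchanged whenever the relationship between the two parts is not clear.</p>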



<h2 class="wp-block-heading">Where do these rewrites apply?</h2>



<p>The rewrites listed above are now implemented in musdb, the import tool, and nodac.</p>



<p>They have also been used to consolidate existing place names, allowing us to identify some 500 duplicate place entries over the weekend (that amounts to almost 0.7 percent of the whole vocabulary). Clearly, identifying similar cases of regularly appearing, varied ways to express the same thing, and determining a canonical way of naming places in such cases, holds a lot of potential for reducing the editors&#8217; workload and improving data quality for everybody.</p>



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Quality Assessments Like in musdb: Now For Everybody</title>
		<link>https://blog.museum-digital.org/2023/10/12/quality-assessments-like-in-musdb-now-for-everybody/</link>
					<comments>https://blog.museum-digital.org/2023/10/12/quality-assessments-like-in-musdb-now-for-everybody/#comments</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Thu, 12 Oct 2023 00:55:29 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[museum-digital:qa]]></category>
		<category><![CDATA[AG Minimaldatensatz]]></category>
		<category><![CDATA[Data quality]]></category>
		<category><![CDATA[New Features]]></category>
		<category><![CDATA[Plaubibility checks]]></category>
		<category><![CDATA[Publication quality index (PuQI)]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=3878</guid>

					<description><![CDATA[At yesterday&#8217;s Autumn Conference of the Working Group Documentation of the German Museum Association (Herbsttagung der Fachgruppe Dokumentation des Deutschen Museumsbunds) a new web service in the broader realm of museum-digital was released: museum-digital:qa. museum-digital:qa reuses the importer&#8217;s relevant functionalities to accept museum object data in a variety of input formats &#8211; both open standards <a href="https://blog.museum-digital.org/2023/10/12/quality-assessments-like-in-musdb-now-for-everybody/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>At yesterday&#8217;s <a href="https://web.archive.org/web/20240125125045/https://www.museumsbund.de/fachgruppe-dokumentation/veranstaltungsarchiv/herbsttagung-2023-der-fg-dokumentation/">Autumn Conference of the Working Group Documentation of the German Museum Association</a> (Herbsttagung der Fachgruppe Dokumentation des Deutschen Museumsbunds) a new web service in the broader realm of museum-digital was released: <a href="https://quality.museum-digital.org/">museum-digital:qa</a>.</p>



<p>museum-digital:qa reuses the <a href="https://blog.museum-digital.org/category/development/importer-en/">importer</a>&#8217;s relevant functionalities to accept museum object data in a variety of input formats &#8211; both open standards and software-specific export formats &#8211; and allows running different tools concerning data quality on the uploaded data. This way we can make the quality checking functionalities that have thus far only been available as part of musdb accessible to museums almost regardless of the collection management system the museum uses.</p>



<p>At the same time, the tool can be extended with further checks that are not (yet?) built into musdb itself. As a first such check, museum-digital:qa can be used to check whether object records conform to the field set for a minimum viable object record for publication, as suggested by the working group of the same name (German: AG Minimaldatensatz).</p>



<p>While aspects of the import tool are reused extensively, no actual import takes place. On the other hand, this reuse means that museum-digital:qa will automatically support yet more input formats as data from more sources can be imported to museum-digital, while actual maintenance work on museum-digital:qa should remain minimal.</p>



<p>P.S.: A long-awaited check &#8211; checking the plausibility of the stated license status of images and media files linked to an object based on its creator / creation &#8211; has now been integrated into musdb as well as md:qa.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2023/10/12/quality-assessments-like-in-musdb-now-for-everybody/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>EODEM Version 1.0 released</title>
		<link>https://blog.museum-digital.org/2023/09/04/eodem-version-1-0-released/</link>
					<comments>https://blog.museum-digital.org/2023/09/04/eodem-version-1-0-released/#respond</comments>
		
		<dc:creator><![CDATA[Joshua Ramon Enslin]]></dc:creator>
		<pubDate>Mon, 04 Sep 2023 15:19:31 +0000</pubDate>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[musdb]]></category>
		<category><![CDATA[EODEM]]></category>
		<category><![CDATA[LIDO]]></category>
		<category><![CDATA[Loan management]]></category>
		<guid isPermaLink="false">https://blog.museum-digital.org/?p=3846</guid>

					<description><![CDATA[Since September 1, 2023, the first stable version of EODEM has been released. EODEM is implemented as a LIDO profile and aims to enable museums to share their object data &#8211; especially in the contexts of loans and exhibitions &#8211; with other museums at the click of a button. Congratulations! museum-digital:musdb has supported EODEM since <a href="https://blog.museum-digital.org/2023/09/04/eodem-version-1-0-released/" class="more-link">...</a>]]></description>
										<content:encoded><![CDATA[
<p>On September 1, 2023, the first stable version of EODEM was released. EODEM is implemented as a LIDO profile and aims to enable museums to share their object data &#8211; especially in the contexts of loans and exhibitions &#8211; with other museums at the click of a button. Congratulations!</p>



<p>museum-digital:<a href="https://en.about.museum-digital.org/software/musdb/">musdb</a> has supported EODEM since February. But the usefulness of a standard is determined by how widely it is adopted. Thankfully, there is now a second collection management system to implement EODEM: Zetcom&#8217;s museumPlus.</p>



<p>You can learn more about EODEM on the <a href="https://cidoc.mini.icom.museum/working-groups/documentation-standards/eodem-home/">project&#8217;s website</a>. See also the very insightful <a href="https://rupertshepherd.info/documentation/eodem-update-8">notes</a> from the project&#8217;s co-coordinator Rupert Shepherd.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.museum-digital.org/2023/09/04/eodem-version-1-0-released/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-thumbnail><url>https://blog.museum-digital.org/wp-content/uploads/2023/09/EODEM_logo_standard.jpg</url><width>600</width><height>132</height></post-thumbnail>	</item>
	</channel>
</rss>
