[{"content":"The message in the infrastructure team\u0026rsquo;s Slack channel was the kind that makes you look up from your screen: \u0026ldquo;Disk at 95% on the production db. Anyone able to look?\u0026rdquo;\nThe server was a MySQL 8.0 on Rocky Linux, a business management system used by about a hundred users. The database itself was around 40 GB — nothing extraordinary. But in the data directory there were 180 GB of binary logs. Six months\u0026rsquo; worth of binlogs that nobody had ever thought to manage.\nIt\u0026rsquo;s not the first time I\u0026rsquo;ve seen this scenario. In fact, I\u0026rsquo;d say it\u0026rsquo;s one of the most recurring patterns in the tickets I receive. The binary log is one of those MySQL features that works silently, asking nothing — until the disk fills up.\nWhat binary logs actually are #The binary log is a sequential record of all events that modify data in the database. Every INSERT, UPDATE, DELETE, every DDL — everything gets written to sequentially numbered binary files: mysql-bin.000001, mysql-bin.000002 and so on.\nThe name is a bit misleading. It\u0026rsquo;s not a \u0026ldquo;log\u0026rdquo; in the syslog or error log sense — it\u0026rsquo;s not meant to be read by a human. It\u0026rsquo;s a structured binary stream that MySQL uses internally for two fundamental purposes:\nReplication: the slave reads the master\u0026rsquo;s binlogs to replicate the same operations Point-in-time recovery (PITR) : after restoring a backup, you can \u0026ldquo;replay\u0026rdquo; the binlogs to bring data up to a precise moment Without the binary log, you can\u0026rsquo;t do either. That\u0026rsquo;s why the first instinct — \u0026ldquo;let\u0026rsquo;s disable binlogs so they don\u0026rsquo;t fill up the disk\u0026rdquo; — is almost always wrong.\nHow MySQL generates binlogs #Binary logging is enabled through the log_bin parameter. 
From MySQL 8.0 it\u0026rsquo;s enabled by default — an important change from previous versions where you had to activate it explicitly.\n[mysqld] log_bin = /var/lib/mysql/mysql-bin server-id = 1 MySQL creates a new binlog file under several circumstances:\nWhen the server starts or restarts When the current file reaches the size defined by max_binlog_size (default: 1 GB) When you run FLUSH BINARY LOGS When a manual rotation occurs Each binlog file has an associated index file (mysql-bin.index) that tracks all active binlog files. This file is critical: if you corrupt it or edit it by hand, MySQL no longer knows which binlogs exist.\nSHOW BINARY LOGS; +------------------+-----------+ | Log_name | File_size | +------------------+-----------+ | mysql-bin.000147 | 1073741824| | mysql-bin.000148 | 1073741824| | mysql-bin.000149 | 1073741824| | ... | | | mysql-bin.000318 | 524288000| +------------------+-----------+ 172 rows in set A hundred and seventy-two files. Each about a gigabyte. The maths checks out: 180 GB of binlogs never purged.\nThe role in replication #In a master-slave architecture, the binary log is the data transport mechanism. The flow goes like this:\nThe master writes every transaction to the binlog The slave has a thread (I/O thread) that connects to the master and reads the binlogs The slave writes what it receives into its own relay log A second thread (SQL thread) on the slave executes events from the relay log This means the binlogs on the master must remain available until all slaves have read them. If you delete a binlog that the slave hasn\u0026rsquo;t consumed yet, replication breaks.\nBefore touching any binlog on a master, the command to run is:\nSHOW REPLICA STATUS\\G -- or, on older versions: SHOW SLAVE STATUS\\G The field you care about is Relay_Master_Log_File (renamed Relay_Source_Log_File from MySQL 8.0.22): it tells you which binlog the slave is currently reading. 
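The rule is mechanical enough to script. A minimal sketch (a hypothetical helper, not a MySQL API; it assumes you have already collected the file list from SHOW BINARY LOGS and the position from the replica yourself):

```python
def purgeable_binlogs(all_logs, replica_log_file):
    # all_logs: Log_name values from SHOW BINARY LOGS, oldest first
    # replica_log_file: the binlog the slowest replica is still reading
    # returns only the files every replica has finished with
    idx = all_logs.index(replica_log_file)
    return all_logs[:idx]
```

With several replicas, use the position of the most lagging one.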
All files before that one are safe to remove.\nPoint-in-time recovery: the other reason binlogs exist #The second use — often underestimated — is point-in-time recovery. The scenario goes like this: you have a backup taken at 3 AM. At 2:30 PM someone runs a wrong DROP TABLE. Without binlogs, you restore the backup and lose everything that happened between 3:00 AM and 2:30 PM. With binlogs, you do the restore and then replay the binlogs up to 2:29 PM.\n# Find the DROP TABLE event mysqlbinlog --start-datetime=\u0026#34;2026-03-30 14:00:00\u0026#34; \\ --stop-datetime=\u0026#34;2026-03-30 15:00:00\u0026#34; \\ /var/lib/mysql/mysql-bin.000318 | grep -i \u0026#34;DROP\u0026#34; # Replay binlogs up to the moment before the disaster mysqlbinlog --stop-datetime=\u0026#34;2026-03-30 14:29:00\u0026#34; \\ /var/lib/mysql/mysql-bin.000310 \\ /var/lib/mysql/mysql-bin.000311 \\ ... \\ /var/lib/mysql/mysql-bin.000318 | mysql -u root -p In practice, binlogs are your insurance policy. The backup is the foundation, binlogs cover the delta. Deleting binlogs without a recent backup is like cancelling your insurance the day before a storm.\nPURGE BINARY LOGS: the right way to clean up #Back to our server with disk at 95%. The temptation to just do rm -f mysql-bin.* is strong. 
But it\u0026rsquo;s wrong, for two reasons:\nMySQL doesn\u0026rsquo;t know you deleted the files — the index file still points to binlogs that no longer exist If there\u0026rsquo;s active replication, you risk breaking synchronisation The correct way is the PURGE command:\n-- Remove all binlogs before a specific file PURGE BINARY LOGS TO \u0026#39;mysql-bin.000300\u0026#39;; -- Or, remove all binlogs older than a specific date PURGE BINARY LOGS BEFORE \u0026#39;2026-03-01 00:00:00\u0026#39;; PURGE does three things that rm doesn\u0026rsquo;t:\nUpdates the index file Checks that the files aren\u0026rsquo;t needed by replication (in theory — but you should check first) Removes files in an orderly manner In our server\u0026rsquo;s case, I first verified there were no slaves:\nSHOW REPLICAS; -- Empty set No replication. Then I checked which binlog was current:\nSHOW MASTER STATUS; +------------------+----------+ | File | Position | +------------------+----------+ | mysql-bin.000318 | 52428800 | +------------------+----------+ Keeping the last 3 files for safety:\nPURGE BINARY LOGS TO \u0026#39;mysql-bin.000316\u0026#39;; Result: 175 GB freed in a few seconds. Disk usage dropped from 95% to 28%.\nConfiguring automatic retention #Solving the emergency is one thing. Making sure it doesn\u0026rsquo;t happen again is another. MySQL offers two parameters for automatic retention management:\nexpire_logs_days (legacy) #[mysqld] expire_logs_days = 14 Automatically removes binlogs older than 14 days. Simple but coarse — granularity is in days only.\nbinlog_expire_logs_seconds (MySQL 8.0+) #[mysqld] binlog_expire_logs_seconds = 1209600 # 14 days in seconds Same logic, but with per-second granularity. From MySQL 8.0, this parameter takes priority over expire_logs_days. If you set both, binlog_expire_logs_seconds wins.\nThe question I always get asked is: \u0026ldquo;How many days of retention?\u0026rdquo;\nIt depends. 
But here are my practical rules:\nScenario Recommended retention Standalone server, daily backup 7 days Master with replica, daily backup 7-14 days Master with slow or remote replica 14-30 days Regulated environments (finance, healthcare) 30-90 days, with archiving The principle is: binlog retention must cover at least twice the interval between two backups. If you back up every night, keep at least 2-3 days of binlogs. If you do weekly backups, at least 14 days.\nIn our server\u0026rsquo;s case, no retention had been set. The MySQL 8.0 default is 30 days — but that value had been overridden to 0 (no expiry) in a custom my.cnf by someone who \u0026ldquo;wanted to keep everything for safety\u0026rdquo;. The irony: the safety they wanted to guarantee was about to crash the server by filling up the disk.\nThe three binlog formats: STATEMENT, ROW, MIXED #Not all binlogs are created equal. MySQL supports three recording formats, and the choice has real implications.\nSTATEMENT #Records the SQL statement as it was executed. Compact, readable, but problematic: functions like NOW(), UUID(), RAND() produce different results on master and slave. Queries with LIMIT without ORDER BY can produce non-deterministic results.\nSET binlog_format = \u0026#39;STATEMENT\u0026#39;; ROW #Records the change at row level — before and after. Heavier in terms of space, but 100% deterministic. If you update 10,000 rows, the binlog contains 10,000 before/after images. Large, but safe.\nSET binlog_format = \u0026#39;ROW\u0026#39;; MIXED #MySQL decides case by case: uses STATEMENT when it\u0026rsquo;s safe, automatically switches to ROW when it detects non-deterministic operations.\nSET binlog_format = \u0026#39;MIXED\u0026#39;; My advice: use ROW. It\u0026rsquo;s been the default since MySQL 5.7.7, it\u0026rsquo;s what Galera Cluster requires, it\u0026rsquo;s what all modern replication tools expect. 
STATEMENT is a legacy from the past, MIXED is a compromise that adds complexity without real benefit.\nThe only case where ROW becomes a problem is when you do massive operations — an UPDATE on millions of rows generates a huge binlog because it contains the before and after of every row. In those cases, the solution isn\u0026rsquo;t to change format, but to break the operation into batches:\n-- Instead of this (generates a massive binlog): UPDATE orders SET status = \u0026#39;archived\u0026#39; WHERE order_date \u0026lt; \u0026#39;2025-01-01\u0026#39;; -- Better like this (batches of 10,000): UPDATE orders SET status = \u0026#39;archived\u0026#39; WHERE order_date \u0026lt; \u0026#39;2025-01-01\u0026#39; AND status != \u0026#39;archived\u0026#39; LIMIT 10000; -- Repeat until 0 rows affected mysqlbinlog: reading binlogs when you need to #The `mysqlbinlog` command-line tool is the only way to inspect the contents of binlog files. It\u0026rsquo;s used in two scenarios: debugging replication problems and point-in-time recovery.\n# Read a binlog in human-readable format mysqlbinlog /var/lib/mysql/mysql-bin.000318 # Filter by time range mysqlbinlog --start-datetime=\u0026#34;2026-03-30 10:00:00\u0026#34; \\ --stop-datetime=\u0026#34;2026-03-30 11:00:00\u0026#34; \\ /var/lib/mysql/mysql-bin.000318 # Filter by specific database mysqlbinlog --database=gestionale /var/lib/mysql/mysql-bin.000318 # If format is ROW, decode into readable SQL mysqlbinlog --verbose /var/lib/mysql/mysql-bin.000318 With ROW format, without --verbose you\u0026rsquo;ll only see binary blobs. With --verbose you get rows in commented pseudo-SQL format — not pretty, but readable.\nThe principle: manage binlogs, don\u0026rsquo;t disable them #Every now and then someone suggests solving the problem \u0026ldquo;at the root\u0026rdquo; by disabling binlogs:\n# DO NOT DO THIS in production skip-log-bin Yes, it solves the disk problem. 
But it eliminates:\nThe ability to set up replication in the future Point-in-time recovery The ability to analyse what happened in the database after an incident Compatibility with CDC (Change Data Capture) tools like Debezium Binlogs are not a problem. Unmanaged binlogs are a problem. The difference is a configuration parameter and a weekly check. On the server I fixed, the final configuration was:\n[mysqld] log_bin = /var/lib/mysql/mysql-bin server-id = 1 binlog_format = ROW binlog_expire_logs_seconds = 604800 # 7 days max_binlog_size = 512M A max_binlog_size of 512 MB instead of the default 1 GB — smaller files are easier to manage, transfer and purge. Seven-day retention with daily backup ensures complete PITR coverage with predictable disk usage.\nPost-intervention check-up #Before closing the ticket, I added a couple of queries to the client\u0026rsquo;s monitoring system:\n-- Space used by binlogs: no system table exposes binlog file sizes, -- so list them and sum the File_size column via script SHOW BINARY LOGS; # Alert if binlogs exceed 20 GB #!/bin/bash BINLOG_SIZE=$(mysql -u monitor -p\u0026#39;pwd\u0026#39; -Bse \u0026#34;SHOW BINARY LOGS\u0026#34; 2\u0026gt;/dev/null | \\ awk \u0026#39;{sum+=$2} END {printf \u0026#34;%.2f\u0026#34;, sum/1024/1024/1024}\u0026#39;) # Fallback: measure the files directly (du -scm reports MB) if [ -z \u0026#34;$BINLOG_SIZE\u0026#34; ]; then BINLOG_SIZE=$(du -scm /var/lib/mysql/mysql-bin.[0-9]* 2\u0026gt;/dev/null | \\ tail -1 | awk \u0026#39;{printf \u0026#34;%.2f\u0026#34;, $1/1024}\u0026#39;) fi if (( $(echo \u0026#34;$BINLOG_SIZE \u0026gt; 20\u0026#34; | bc -l) )); then echo \u0026#34;WARNING: binlog size ${BINLOG_SIZE} GB\u0026#34; fi Three weeks after the intervention, the binlogs were using 8 GB — exactly within the predicted window. 
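The predicted window is simple arithmetic. A minimal sketch (the 1 GB/day write rate is an estimate inferred from 180 GB accumulated over roughly six months, not a measured figure):

```python
def binlog_steady_state_gb(daily_binlog_gb, retention_days):
    # with binlog_expire_logs_seconds set, binlog disk usage plateaus
    # at roughly the daily write rate multiplied by the retention
    return daily_binlog_gb * retention_days

expected = binlog_steady_state_gb(1.0, 7)  # about 7 GB at a 7-day retention
```

The 8 GB observed sits right on top of that estimate.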
The disk never went above 45%.\nThe binlog is like engine oil: you never think about it until the warning light comes on. The difference is that the engine warns you. MySQL doesn\u0026rsquo;t — it keeps writing binlogs as long as the filesystem responds. When it stops responding, it\u0026rsquo;s too late to wonder why you hadn\u0026rsquo;t set up retention.\nGlossary #Binary log — MySQL\u0026rsquo;s sequential binary record that tracks all data modifications (INSERT, UPDATE, DELETE, DDL), used for replication and point-in-time recovery. Enabled by default since MySQL 8.0.\nPITR — Point-in-Time Recovery: a restore technique that combines a full backup with binary logs to bring the database back to any moment in time, not just the backup time.\nRelay log — Intermediate log file on a MySQL slave that receives events from the master\u0026rsquo;s binary log before they are executed locally by the SQL thread.\nCDC — Change Data Capture: a technique for intercepting data changes in real time by reading transaction logs. Tools like Debezium read MySQL binary logs to propagate changes to external systems.\nmysqlbinlog — MySQL command-line utility for reading, filtering and replaying the contents of binary log files. Essential for point-in-time recovery and replication debugging.\n","date":"31 March 2026","permalink":"https://ivanluminaria.com/en/posts/mysql/binary-log-mysql/","section":"Database Strategy","summary":"\u003cp\u003eThe message in the infrastructure team\u0026rsquo;s Slack channel was the kind that makes you look up from your screen: \u0026ldquo;Disk at 95% on the production db. Anyone able to look?\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eThe server was a MySQL 8.0 on Rocky Linux, a business management system used by about a hundred users. The database itself was around 40 GB — nothing extraordinary. But in the data directory there were 180 GB of binary logs. 
Six months\u0026rsquo; worth of binlogs that nobody had ever thought to manage.\u003c/p\u003e","title":"Binary logs in MySQL: what they are, how to manage them, and when you can delete them"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/binlog/","section":"Tags","summary":"","title":"Binlog"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/","section":"Categories","summary":"","title":"Categories"},{"content":" The difference between a system that works\nand one that truly drives business is not luck.\nIt is deep understanding of execution plans.\nIt is control over privileges and data security.\nIt is data modeling aligned with business objectives.\nIt is performance that holds under growing load.\nDatabases are the operational core of every modern digital ecosystem.\nThey support critical processes, enable data-driven decisions and determine operational speed and efficiency.\nInside the Engine is where I analyze what happens under the hood of PostgreSQL, Oracle and MySQL: performance tuning, security, architecture and technical decisions applicable to real-world production systems.\nBecause in today’s data-driven world, databases are not just software components.\nThey are strategic assets that influence competitiveness, reliability and sustainable growth.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/","section":"Database Strategy","summary":"\u003cblockquote\u003e\n\u003cp\u003eThe difference between a system that works\u003cbr\u003e\nand one that truly drives business is not luck.\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eIt is deep understanding of execution plans.\u003cbr\u003e\nIt is control over privileges and data security.\u003cbr\u003e\nIt is data modeling aligned with business objectives.\u003cbr\u003e\nIt is performance that holds under growing load.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eDatabases are the operational core of every modern digital 
ecosystem.\u003cbr\u003e\nThey support critical processes, enable data-driven decisions and determine operational speed and efficiency.\u003cbr\u003e\u003c/p\u003e","title":"Database Strategy"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/disk-space/","section":"Tags","summary":"","title":"Disk-Space"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/","section":"Ivan Luminaria","summary":"","title":"Ivan Luminaria"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/mariadb/","section":"Tags","summary":"","title":"Mariadb"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/mysql/","section":"Categories","summary":"","title":"Mysql"},{"content":"MySQL is the database that needs no introduction.\nIt is the engine that powered the growth of the web for over twenty years.\nBorn in 1995 in Sweden, in 2008 it was acquired by Sun Microsystems — and when Oracle completed its acquisition of Sun in 2010, MySQL ended up in the portfolio of the world\u0026rsquo;s largest commercial database vendor. I was an Oracle employee at the time, and I remember the atmosphere well: on one hand the curiosity of seeing how Oracle would manage such a popular open source product, on the other the concern that MySQL would be sidelined in favour of the proprietary database.\nThat concern drove Michael \u0026ldquo;Monty\u0026rdquo; Widenius — MySQL\u0026rsquo;s original creator — to fork the project in 2009, giving birth to MariaDB. A project that shares its roots with MySQL but has taken its own path on storage engines, optimizer and advanced features.\nHistory has shown that both projects survived and evolved, but their architectural choices diverge more and more. 
Knowing the differences is not academic — it is an operational necessity.\nIn this section I explore MySQL and MariaDB from an operational perspective: security, user management, performance and design decisions that make a difference in production environments.\nBecause using MySQL is not just about running queries.\nIt is about understanding how the engine manages connections, privileges and resources under real load.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/mysql/","section":"Database Strategy","summary":"\u003cp\u003eMySQL is the database that needs no introduction.\u003cbr\u003e\nIt is the engine that powered the growth of the web for over twenty years.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eBorn in 1995 in Sweden, in 2008 it was acquired by Sun Microsystems — and when Oracle completed its acquisition of Sun in 2010, MySQL ended up in the portfolio of the world\u0026rsquo;s largest commercial database vendor. I was an Oracle employee at the time, and I remember the atmosphere well: on one hand the curiosity of seeing how Oracle would manage such a popular open source product, on the other the concern that MySQL would be sidelined in favour of the proprietary 
database.\u003cbr\u003e\u003c/p\u003e","title":"MySQL"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/recovery/","section":"Tags","summary":"","title":"Recovery"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/replication/","section":"Tags","summary":"","title":"Replication"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/","section":"Tags","summary":"","title":"Tags"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/autovacuum/","section":"Tags","summary":"","title":"Autovacuum"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/bloat/","section":"Tags","summary":"","title":"Bloat"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/mvcc/","section":"Tags","summary":"","title":"Mvcc"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/performance/","section":"Tags","summary":"","title":"Performance"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/postgresql/","section":"Categories","summary":"","title":"Postgresql"},{"content":"PostgreSQL is not just an open source database.\nIt is the result of nearly four decades of academic and industrial evolution.\nBorn in 1986 at the University of Berkeley as an evolution of Ingres, the original POSTGRES project introduced concepts that were ahead of its time: extensibility, custom data types, rules and an advanced relational model.\nIn 1996 SQL support was added and the name became PostgreSQL.\nThe world, however, kept calling it simply “Postgres”.\nAnd that’s perfectly fine.\nIn this section I explore PostgreSQL from an architectural and operational perspective: design, performance, security and technical decisions applicable to real-world environments.\nBecause choosing PostgreSQL is not just choosing an open source database.\nIt is choosing an engine designed to be extended, analyzed and truly 
understood.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/postgresql/","section":"Database Strategy","summary":"\u003cp\u003ePostgreSQL is not just an open source database.\u003cbr\u003e\nIt is the result of nearly four decades of academic and industrial evolution.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eBorn in 1986 at the University of Berkeley as an evolution of Ingres, the original POSTGRES project introduced concepts that were ahead of its time: extensibility, custom data types, rules and an advanced relational model.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eIn 1996 SQL support was added and the name became PostgreSQL.\u003cbr\u003e\nThe world, however, kept calling it simply “Postgres”.\u003cbr\u003e\nAnd that’s perfectly fine.\u003cbr\u003e\u003c/p\u003e","title":"PostgreSQL"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/vacuum/","section":"Tags","summary":"","title":"Vacuum"},{"content":"A couple of years ago I was asked to look at a production PostgreSQL instance that \u0026ldquo;slows down every week\u0026rdquo;. Always the same pattern: Monday is fine, Friday is a disaster. Someone restarts the service over the weekend and the cycle starts again.\nDatabase around 200 GB. Main tables occupying nearly three times their actual data size. Queries falling into sequential scans where they shouldn\u0026rsquo;t have been. Response times climbing day after day.\nAutovacuum was enabled. Nobody had disabled it. But nobody had configured it either.\n🧠 MVCC: why PostgreSQL generates \u0026ldquo;garbage\u0026rdquo; #To understand the problem, you need a step back. PostgreSQL uses MVCC — Multi-Version Concurrency Control. Every time you run an UPDATE, the database doesn\u0026rsquo;t overwrite the original row. It creates a new version and marks the old one as \u0026ldquo;dead\u0026rdquo;.\nSame for DELETEs: the row isn\u0026rsquo;t physically removed. 
It\u0026rsquo;s marked as no longer visible to new transactions.\nThese dead rows are called dead tuples. They stay inside the data pages, taking up disk space and slowing down scans.\nThat\u0026rsquo;s the price PostgreSQL pays for transactional isolation without exclusive read locks. A fair price — as long as someone sweeps up afterwards.\n🔧 VACUUM: what it actually does #The VACUUM command does one simple thing: it reclaims space taken by dead tuples and makes it reusable for new inserts.\nIt doesn\u0026rsquo;t return space to the operating system. It doesn\u0026rsquo;t reorganize the table. It doesn\u0026rsquo;t compact anything. It marks pages as rewritable.\nVACUUM reporting.transactions; That\u0026rsquo;s enough in most cases. VACUUM is lightweight, doesn\u0026rsquo;t block writes, and can run alongside normal queries.\nWhat about VACUUM FULL? #VACUUM FULL is a different beast. It physically rewrites the entire table, eliminating all dead space. It returns space to the filesystem.\nBut the cost is brutal: it takes an exclusive lock on the table for the entire duration. No reads, no writes. On large tables we\u0026rsquo;re talking minutes or hours.\nVACUUM FULL reporting.transactions; In production, VACUUM FULL should be used very rarely. In emergencies. And always off-hours.\n⚙️ Autovacuum: the silent janitor #PostgreSQL has a daemon that runs VACUUM automatically: autovacuum.\nIt kicks in when a table accumulates enough dead tuples. The threshold is calculated like this:\nvacuum threshold = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × n_live_tup The defaults:\nautovacuum_vacuum_threshold: 50 dead tuples autovacuum_vacuum_scale_factor: 0.2 (20%) In plain terms: on a table with 10 million rows, autovacuum fires when dead tuples exceed 2,000,050. Two million dead rows before anyone cleans up.\nFor a table with 500,000 updates per day, that means autovacuum triggers maybe every 4 days. 
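Those numbers fall straight out of the formula above; a quick sketch to sanity-check them:

```python
def autovacuum_trigger(n_live_tup, threshold=50, scale_factor=0.2):
    # dead tuples required before autovacuum fires, using the defaults
    return threshold + scale_factor * n_live_tup

trigger = autovacuum_trigger(10_000_000)  # 2000050 dead tuples
days_to_trigger = trigger / 500_000       # about 4 days at 500k updates/day
```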
In the meantime bloat grows, scans slow down, indexes swell.\nThat\u0026rsquo;s why Monday was fine and Friday was a disaster.\n📊 Diagnostics: reading pg_stat_user_tables #The first thing to do when you suspect a vacuum problem is to query pg_stat_user_tables:\nSELECT schemaname, relname, n_live_tup, n_dead_tup, round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct, last_vacuum, last_autovacuum, autovacuum_count, vacuum_count FROM pg_stat_user_tables WHERE n_dead_tup \u0026gt; 10000 ORDER BY n_dead_tup DESC; In my client\u0026rsquo;s case, the picture looked like this:\nrelname | n_live_tup | n_dead_tup | dead_pct | last_autovacuum -------------------+------------+------------+----------+------------------ transactions | 12,400,000 | 3,800,000 | 23.5% | 3 days ago order_lines | 8,200,000 | 2,100,000 | 20.4% | 4 days ago inventory_moves | 5,600,000 | 1,900,000 | 25.3% | 5 days ago Nearly a quarter of the rows were dead. Autovacuum was running, but far too infrequently to keep up.\n🎯 Tuning: adapting autovacuum to reality #The trick isn\u0026rsquo;t to disable autovacuum. Never. The trick is to configure it for the tables that need it.\nPostgreSQL lets you set autovacuum parameters per table:\nALTER TABLE reporting.transactions SET ( autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_threshold = 1000 ); With this setting, autovacuum fires after 1,000 + 1% of live rows worth of dead tuples. On 12 million rows, it kicks in at ~121,000 dead tuples instead of 2 million.\ncost_delay: don\u0026rsquo;t throttle the vacuum #Another critical parameter is autovacuum_vacuum_cost_delay. It controls how much vacuum \u0026ldquo;slows itself down\u0026rdquo; to avoid overloading I/O.\nThe default is 2 milliseconds. On modern servers with SSDs, that\u0026rsquo;s too conservative. 
Reducing it to 0 or 1 ms lets vacuum finish faster:\nALTER TABLE reporting.transactions SET ( autovacuum_vacuum_cost_delay = 0 ); max_workers #The default is 3 autovacuum workers. If you have dozens of high-traffic tables, 3 workers aren\u0026rsquo;t enough. Consider raising to 5–6, while monitoring CPU and I/O impact:\n-- in postgresql.conf autovacuum_max_workers = 5 📏 Measuring bloat #How do you know how much space your tables are wasting?\nThe classic query uses pgstattuple:\nCREATE EXTENSION IF NOT EXISTS pgstattuple; SELECT pg_size_pretty(pg_total_relation_size(\u0026#39;reporting.transactions\u0026#39;)) AS total_size, pg_size_pretty(pg_total_relation_size(\u0026#39;reporting.transactions\u0026#39;) - pg_relation_size(\u0026#39;reporting.transactions\u0026#39;)) AS index_size, * FROM pgstattuple(\u0026#39;reporting.transactions\u0026#39;); Key fields: dead_tuple_percent and free_space. If dead_tuple exceeds 20–30%, the table has a serious problem.\nA less precise but lighter alternative is estimating bloat ratio by comparing pg_class.relpages with estimated rows — there are well-known queries in the community for this (the classic \u0026ldquo;bloat estimation query\u0026rdquo; from PostgreSQL Experts).\n🛠️ When VACUUM isn\u0026rsquo;t enough: pg_repack #If bloat is already out of control — tables at 50–70% dead space — regular VACUUM won\u0026rsquo;t reclaim everything. It frees dead tuples, but fragmented space remains.\nVACUUM FULL works but locks everything.\nThe production alternative is pg_repack: it rebuilds the table online, without prolonged exclusive locks.\npg_repack -d mydb -t reporting.transactions This isn\u0026rsquo;t a weekly solution. It\u0026rsquo;s the heavy-duty fix for when things have already gone south. The real solution is to never get there, with a well-configured autovacuum.\n💬 The principle #Disabling autovacuum is the worst thing you can do to a production PostgreSQL. 
I\u0026rsquo;ve seen it done \u0026ldquo;because it slows down queries during the day\u0026rdquo;. Sure, because in the meantime bloat is eating your database from the inside.\nAutovacuum with PostgreSQL defaults is designed for a generic database. No production database is generic. Every table has its own write pattern, its own volume, its own rhythm.\nThree things to take away:\nCheck pg_stat_user_tables regularly. If n_dead_tup grows faster than autovacuum can clean, you have a problem.\nConfigure scale_factor and threshold for high-traffic tables. There\u0026rsquo;s no universal configuration.\nDon\u0026rsquo;t wait until bloat reaches 50% to act. At that point your options are few and all painful.\nDatabases don\u0026rsquo;t maintain themselves. Not even the ones that have a daemon trying to.\nGlossary #VACUUM — PostgreSQL command that reclaims space occupied by dead tuples, making it reusable for new inserts without returning it to the operating system.\nMVCC — Multi-Version Concurrency Control — PostgreSQL\u0026rsquo;s concurrency model that maintains multiple row versions to ensure transactional isolation without exclusive locks on reads.\nDead Tuple — Obsolete row in a PostgreSQL table, marked as no longer visible after an UPDATE or DELETE but not yet physically removed from disk.\nAutovacuum — PostgreSQL daemon that automatically runs VACUUM and ANALYZE on tables when the number of dead tuples exceeds a configurable threshold.\nBloat — Dead space accumulated in a PostgreSQL table or index due to unremoved dead tuples, inflating disk size and degrading query performance.\n","date":"24 March 2026","permalink":"https://ivanluminaria.com/en/posts/postgresql/vacuum-autovacuum-postgresql/","section":"Database Strategy","summary":"\u003cp\u003eA couple of years ago I was asked to look at a production PostgreSQL\ninstance that \u0026ldquo;slows down every week\u0026rdquo;. Always the same pattern: Monday\nis fine, Friday is a disaster. 
Someone restarts the service over the\nweekend and the cycle starts again.\u003c/p\u003e\n\u003cp\u003eDatabase around 200 GB. Main tables occupying nearly three times their\nactual data size. Queries falling into sequential scans where they\nshouldn\u0026rsquo;t have been. Response times climbing day after day.\u003c/p\u003e","title":"VACUUM and autovacuum: why PostgreSQL needs someone to clean up"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/ai/","section":"Tags","summary":"","title":"Ai"},{"content":"A few months ago, during a meeting with a banking client, the CTO said something that stuck with me.\n\u0026ldquo;We need someone who manages AI. Not someone who uses it — someone who governs it.\u0026rdquo;\nI nodded without speaking. Because that sentence, in seven seconds, described a role the market is looking for without yet knowing what to call it.\n🧩 The fundamental misunderstanding #There is a widespread confusion, and I see it in every project where AI enters the picture.\nThe confusion is this: thinking that \u0026ldquo;adopting AI\u0026rdquo; means integrating a model, connecting an API, having an assistant generate text or code.\nNo. That is the technical side. The operational side. That is the work of a data scientist or an ML engineer. Important work, to be clear. But it is not the work of someone who governs.\nGoverning AI in a project means answering questions that no model can answer for you:\nWhere does AI create real value and where does it only create enthusiasm? How much does it cost to maintain, not just to implement? What happens when the model gets it wrong — and who is accountable? How does it integrate with existing architectures without compromising stability and security? How do you ensure alignment between data governance, compliance and automation? If you do not have answers to these questions, you are not governing AI. You are being governed by it.\n🏗️ It is not a new role. 
It is a role that did not have a name yet #When I think about it, I realize I have been doing this work long before anyone coined the label \u0026ldquo;AI Manager\u0026rdquo;.\nThirty years of data architectures. Mission-critical systems in Telco, Banking, Insurance, Public Administration. Environments where data is not an asset to monetize — it is infrastructure to protect.\nIn those contexts I have always done the same thing: connecting strategy to technical reality. Translating business needs into solutions that actually work, not on a slide but in production. Mediating between those who want everything now and those who know that certain things require time and architecture.\nAI has not changed this pattern. It has made it more visible.\nBecause AI, unlike a database or an ETL pipeline, is a topic that excites boards and scares engineers. Everyone wants a piece of it, few know where to put it. And the role of the person in between — between management enthusiasm and infrastructure caution — becomes crucial.\n📍 Where AI creates real value (and where it does not) #I have learned something over the past three years, working with AI in real project contexts: the value of AI is almost never where people think it is.\nIt is not in automatic code generation. Not in the chatbot answering customers. Not in the report that writes itself.\nReal value lies in three places:\n1. Accelerating analysis\nAI is devastating when it needs to analyze context. Reading thousands of lines of code, correlating logs, spotting patterns. What costs a senior engineer two hours, AI does in seconds. Not better — faster. And speed, in a project with deadlines, is money.\n2. Reducing decision noise\nIn every complex project there is a moment when information is too much and the team no longer knows what is urgent versus what is important. AI can do triage. It can classify, prioritize, highlight anomalies. 
It does not decide for you — it presents data in a way that makes the decision clearer.\n3. Documentation and knowledge transfer\nNobody likes documenting. Nobody. AI can generate documentation from code, commits, issues. Not perfect, but enough to avoid losing knowledge when someone leaves the project. And anyone who has managed projects knows how much that costs.\nEverything else — the shiny demos, the presentations with bold percentages, the vendors promising triple-digit ROI — is noise. The AI Manager is the one who separates the signal from the noise.\n⚖️ The triangle the PM must govern #In every project where AI enters a regulated environment, there is a triangle that keeps coming back:\nData governance — Compliance — Automation.\nYou can have the most efficient automation in the world, but if it violates data governance policies, it is a risk. You can have impeccable governance, but if it blocks every form of automation, the project stalls. You can be perfectly compliant, but if you do not know which data you are using to train or query the model, compliance is only on paper.\nThe AI Manager must keep these three vertices in balance. Continuously. Not once at the start of the project — every week.\nI have seen projects where AI was integrated without anyone verifying the provenance of the training data. In banking. With data subject to GDPR. The DPO found out three months later.\nThat is not incompetence. It is absence of governance. It is the absence of someone asking the right question at the right time.\n🔬 Integrate, do not replace #Something I repeat in every kickoff meeting: AI integrates into existing architectures. 
It does not replace them.\nIt sounds obvious, yet the temptation is always the same: the vendor proposing to \u0026ldquo;rethink the infrastructure through an AI lens\u0026rdquo;, the consultant who wants a greenfield architecture, the manager who saw a demo and now wants everything new.\nNo.\nMission-critical architectures are not thrown away because a new technology has arrived. They evolve. They extend. They are protected.\nThe AI Manager is the person who says \u0026ldquo;this model plugs in here, with these precautions, with this fallback plan\u0026rdquo;. Not the person who says \u0026ldquo;let us throw everything away and rebuild with AI\u0026rdquo;.\nIn thirty years of systems, I have seen at least five \u0026ldquo;revolutionary\u0026rdquo; technologies that were supposed to change everything. Client-server. Internet. Cloud. Big Data. Now AI. None of them changed everything. Each of them changed something. And those who governed the change well were those who integrated it with intelligence — not with enthusiasm.\n🎯 Why AI is not magic #There is a phrase I use often, and I never tire of repeating it.\nAI is not magic. It is architecture applied to intelligence.\nA model is a component. Like a database, like a message broker, like a load balancer. It needs clean inputs, monitoring, maintenance, governance. It needs someone who understands what it does, what it can do, and above all what it cannot do.\nThe Project Manager who ignores these aspects and delegates everything to the technical team is making the same mistake as the one who delegated security to the sysadmin and then was surprised by the data breach.\nAI is an architectural responsibility. And like all architectural responsibilities, it must be governed from above. 
Not from below.\n💬 To those deciding whether AI \u0026ldquo;belongs\u0026rdquo; in their project #If you are evaluating whether to introduce AI in a project — not an experiment, a real project, with deadlines, budget and stakeholders — here is one piece of advice worth more than any tool.\nDo not start from the technology. Start from the problem.\nWhat is the bottleneck? Where does the team lose the most time? Where are decisions slower than they need to be? Where is knowledge being lost?\nIf the answer to any of these questions involves analyzing large volumes of data, classifying information, or accelerating repetitive processes — then AI can help. But only if someone governs it.\nAnd governing it does not mean controlling it. It means understanding it well enough to know when to trust it and when not to.\nThat is the job of the AI Manager. And whether you call it that or not, it is a role every serious project will need.\nGlossary #AI Manager — Professional role that governs the impact of artificial intelligence on architectures, processes and people, separating real value from noise.\nData Governance — Set of policies, processes and standards that ensure data quality, security and compliance within an organization.\nKnowledge Transfer — Process of transferring knowledge between people, teams or systems, critical in IT projects where know-how loss can compromise operational continuity.\nROI — Return on Investment — ratio between benefit obtained and cost incurred for an investment.\nCompliance — Adherence to applicable regulations, rules and standards — in the AI context includes GDPR, industry regulations and internal policies on data and model usage.\n","date":"17 March 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/ai-manager-project-management/","section":"Database Strategy","summary":"\u003cp\u003eA few months ago, during a meeting with a banking client, the CTO said something that stuck with 
me.\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;We need someone who manages AI. Not someone who uses it — someone who governs it.\u0026rdquo;\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eI nodded without speaking. Because that sentence, in seven seconds, described a role the market is looking for without yet knowing what to call it.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"-the-fundamental-misunderstanding\" class=\"relative group\"\u003e🧩 The fundamental misunderstanding \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#-the-fundamental-misunderstanding\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThere is a widespread confusion, and I see it in every project where AI enters the picture.\u003c/p\u003e","title":"AI Manager and Project Management: when artificial intelligence enters the project"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/ai-manager/","section":"Tags","summary":"","title":"Ai-Manager"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/governance/","section":"Tags","summary":"","title":"Governance"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/project-management/","section":"Categories","summary":"","title":"Project Management"},{"content":"I\u0026rsquo;ve seen project managers make senior developers cry in meetings. I\u0026rsquo;ve seen brilliant teams destroyed by PMs who confused authority with authoritarianism. 
I\u0026rsquo;ve seen software delivered \u0026ldquo;on time\u0026rdquo; that didn\u0026rsquo;t work, and multi-million euro projects that ended in nothing.\nAnd I\u0026rsquo;ve seen the exact opposite: small, autonomous, respected teams — building solid systems in a fraction of the time and budget.\nThe difference was never technology. It was always the method.\nAfter thirty years in this profession, I\u0026rsquo;ve understood one thing: people don\u0026rsquo;t give their best when they\u0026rsquo;re afraid. They give their best when they have trust.\nTrust in the team. Trust in the process. Trust that if they make mistakes, they won\u0026rsquo;t be punished — they\u0026rsquo;ll be helped.\nThe project manager who works isn\u0026rsquo;t the one who controls, terrorises and counts hours. It\u0026rsquo;s the one who:\nsets clear objectives and gives the team the freedom to achieve them\nbuilds deep competencies and protects them from turnover\ndoes team building every day, not once a year at the go-kart track\nmeasures results, not hours in the office\nuses smart working as a competitive advantage, not as a concession\nwhen something goes wrong, stands in front of the team, not behind it\n📊 How I work #My approach is Scrum — but Scrum done properly, not the liturgy of post-it notes.\nScrum that works is based on a pact: radical transparency, shared responsibility, and autonomy in the \u0026ldquo;how\u0026rdquo;. The PM defines the what and the why. The team decides the how. Because those who write the code know better than anyone how to write it.\nAnd this pact works even better with smart working. The daily standup? 15 minutes on a call. The sprint review? Screen sharing with the client, who sees the software — not slides. The retrospective? 
An hour where the team tells the truth, because there\u0026rsquo;s no PM two metres away staring them down.\nI measure few things, but I measure them well:\nVelocity: how much the team delivers per sprint\nLead time: how long from request to release\nBug escape rate: how many defects slip into production\nSprint goal success: % of sprint goals achieved\nTeam happiness: how the team feels — the most important metric\nThe last one — team happiness — is the one traditional PMs never measure. A happy team isn\u0026rsquo;t a team that\u0026rsquo;s having fun. It\u0026rsquo;s a team that feels respected, heard and valued. And a team like that produces more. Not because they work more hours. Because they work better.\n📚 What I write about here #True stories, hard numbers and lessons learned. No textbook theory. Just what I\u0026rsquo;ve seen work — and what I\u0026rsquo;ve seen fail.\nI write about artificial intelligence applied to workflow, IT consulting and its hidden costs, smart working as a competitive advantage. Every article comes from a real experience — mine.\nNo revolutions needed. Just precise choices, implemented with method.\nAnd the ability to say \u0026ldquo;no\u0026rdquo; to those who sell you complexity when the solution is simple.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/project-management/","section":"Database Strategy","summary":"\u003cp\u003eI\u0026rsquo;ve seen project managers make senior developers cry in meetings. I\u0026rsquo;ve seen brilliant teams destroyed by PMs who confused authority with authoritarianism. I\u0026rsquo;ve seen software delivered \u0026ldquo;on time\u0026rdquo; that didn\u0026rsquo;t work, and multi-million euro projects that ended in nothing.\u003c/p\u003e\n\u003cp\u003eAnd I\u0026rsquo;ve seen the exact opposite: small, autonomous, respected teams — building solid systems in a fraction of the time and budget.\u003c/p\u003e\n\u003cp\u003eThe difference was never technology. 
It was always \u003cstrong\u003ethe method\u003c/strong\u003e.\u003c/p\u003e","title":"Project Management"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/project-management/","section":"Tags","summary":"","title":"Project-Management"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/strategy/","section":"Tags","summary":"","title":"Strategy"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/consulting/","section":"Tags","summary":"","title":"Consulting"},{"content":"A data warehouse is not a database with bigger tables.\nIt is a different way of thinking about data — oriented towards analysis, history, decisions.\nThe difference between a DWH that works and one that becomes a problem almost always lies in the model. Fact tables with the wrong granularity, poorly designed dimensions, hierarchies that cannot support aggregation queries. Problems that are invisible during development but explode when the business asks for reports the model cannot deliver.\nIn this section I share real cases of data warehouse design and restructuring: dimensional modeling, balanced hierarchies, slowly changing dimensions, loading strategies. Not Kimball textbook theory, but solutions applied in production on systems that serve real business decisions.\nBecause a data warehouse is not built to contain data.\nIt is built to answer questions.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/data-warehouse/","section":"Database Strategy","summary":"\u003cp\u003eA data warehouse is not a database with bigger tables.\u003cbr\u003e\nIt is a different way of thinking about data — oriented towards analysis, history, decisions.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eThe difference between a DWH that works and one that becomes a problem almost always lies in the model. Fact tables with the wrong granularity, poorly designed dimensions, hierarchies that cannot support aggregation queries. 
Problems that are invisible during development but explode when the business asks for reports the model cannot deliver.\u003cbr\u003e\u003c/p\u003e","title":"Data Warehouse"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/european-union/","section":"Tags","summary":"","title":"European-Union"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/freelance/","section":"Tags","summary":"","title":"Freelance"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/italy/","section":"Tags","summary":"","title":"Italy"},{"content":"Oracle is the database that shaped me professionally.\nI have been working with it since 1996, and in nearly thirty years I have seen versions, paradigms and trends come and go — but the core of the engine has remained the same: solid, complex, unforgiving to those who do not know it deeply.\nI have managed instances with a few hundred users and data warehouses with billions of rows. I configured Data Guard when it was still called standby database, I wrote PL/SQL when debugging meant DBMS_OUTPUT and patience, I designed partitioning schemes before they became a marketing deck feature.\nOracle is not a database you learn from tutorials.\nYou learn it from incidents, from migrations at three in the morning, from execution plans that change after a statistics update.\nIn this section I share what I have learned in the field: architecture, security, performance and the design decisions that separate an installation that works from one that merely survives.\nBecause with Oracle, knowing the syntax is not enough.\nYou need to understand how the engine thinks.\n","date":null,"permalink":"https://ivanluminaria.com/en/posts/oracle/","section":"Database Strategy","summary":"\u003cp\u003eOracle is the database that shaped me professionally.\u003cbr\u003e\nI have been working with it since 1996, and in nearly thirty years I have seen versions, paradigms and trends come and go — but the core 
of the engine has remained the same: solid, complex, unforgiving to those who do not know it deeply.\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eI have managed instances with a few hundred users and data warehouses with billions of rows. I configured Data Guard when it was still called standby database, I wrote PL/SQL when debugging meant DBMS_OUTPUT and patience, I designed partitioning schemes before they became a marketing deck feature.\u003cbr\u003e\u003c/p\u003e","title":"Oracle"},{"content":"The first time I worked with an international client, something strange happened. They paid me in thirty days.\nNot thirty days from the end of the month. Not thirty days from the invoice receipt date stamped and countersigned by the administration manager. Thirty days from the invoice. Period.\nI checked my bank statement twice. I thought it was a mistake.\nIt wasn\u0026rsquo;t a mistake. It was normality — just not Italian normality.\n🇮🇹 Italian normality: waiting is part of the job #In Italy, if you\u0026rsquo;re an IT consultant working as a freelancer, the payment cycle goes roughly like this:\nYou work in October. You invoice at the end of October. The invoice enters the client\u0026rsquo;s administrative cycle in November. The client pays at \u0026ldquo;60 days end of month\u0026rdquo; — which in practice means the end of January. If things go well. If there\u0026rsquo;s no year-end payment freeze. If the administration office hasn\u0026rsquo;t \u0026ldquo;lost\u0026rdquo; the invoice. If the manager has signed the approval. Result: you work in October, you see the money in February. Four months.\nAnd I\u0026rsquo;m not talking about pathological situations. I\u0026rsquo;m talking about standard contractual practice in Italian IT consulting. Contracts that explicitly state \u0026ldquo;payment at 60 days from invoice date end of month\u0026rdquo; or, worse, \u0026ldquo;90 days from invoice date end of month.\u0026rdquo;\nI\u0026rsquo;ve seen contracts at 120 days. 
One hundred and twenty days. Four months stated in the contract, which in reality become five or six. Signed without flinching, because \u0026ldquo;that\u0026rsquo;s how it works\u0026rdquo; and because the consultant who complains is the consultant who doesn\u0026rsquo;t get called back.\n🇪🇺 What Europe says (and what Italy pretends not to hear) #The European Directive 2011/7/EU on combating late payment in commercial transactions is clear. Crystal clear, I\u0026rsquo;d say:\nStandard term: 30 days from invoice date\nMaximum term between businesses: 60 days (only with explicit agreement and if not grossly unfair)\nTerm for Public Administration: 30 days (extendable to 60 only in exceptional cases)\nAutomatic late payment interest: ECB rate + 8% — without need for formal notice\nFixed compensation for recovery: €40 minimum per invoice paid late\nThirty days. Not ninety. Not one hundred and twenty. Thirty.\nThe directive was transposed into Italian law with Legislative Decree 231/2002 (amended in 2012). On paper, the rules exist. In practice, it\u0026rsquo;s as if they don\u0026rsquo;t.\n🇩🇪 How it works in the rest of Europe #When I tell my German, Dutch or Scandinavian colleagues that standard payment terms in Italy are 60-90 days, the reaction is always the same: first astonishment, then a nervous laugh.\nIn Germany the average payment term is 24 days. Not because Germans are more generous — because the system enforces it. A payment beyond 30 days automatically generates late payment interest. Companies know this and they pay.\nIn the Netherlands the average term is 27 days. The trade association MKB-Nederland publishes annual statistics on delays, and companies that don\u0026rsquo;t respect terms end up on public lists.\nIn Nordic countries — Sweden, Denmark, Finland — paying at 14 days is normal. At 30 it\u0026rsquo;s already considered long.\nAnd Italy? The actual average payment term in Italy is 80 days according to the European Payment Report. 
Eighty. Almost three times the European average.\nAverage payment term by country:\nSweden: 27 days\nGermany: 24 days\nNetherlands: 27 days\nFrance: 44 days\nSpain: 56 days\nItaly: 80 days\nEU Directive maximum: 60 days\nItaly isn\u0026rsquo;t just off-scale compared to northern Europe. It\u0026rsquo;s off-scale compared to its own law.\n💰 The real impact on those who work #Let\u0026rsquo;s take a concrete example. A senior IT consultant billing €250 per day, 220 working days per year.\nAnnual gross revenue: €55,000.\nWith 30-day payments, cash flow is manageable. Each month, last month\u0026rsquo;s work comes in. The consultant can plan, invest, pay taxes without going into the red.\nWith 90-day end-of-month payments, the picture changes radically:\nJanuary: €5,500 worked, €0 collected, €5,500 outstanding\nFebruary: €5,500 worked, €0 collected, €11,000 outstanding\nMarch: €5,500 worked, €0 collected, €16,500 outstanding\nApril: €5,500 worked, €0 collected, €22,000 outstanding\nMay: €5,500 worked, €5,500 collected, €22,000 outstanding\nFor the first four months the consultant works on credit. Sixteen to twenty-two thousand euros of work done and unpaid. Meanwhile, taxes don\u0026rsquo;t wait. Social security contributions don\u0026rsquo;t wait. Rent doesn\u0026rsquo;t wait.\nThat consultant is financing their client. For free. Without interest. Without guarantees.\nIf you think about it, it\u0026rsquo;s a loan. Only nobody calls it that. They call it \u0026ldquo;contractual terms.\u0026rdquo;\n🏦 The perverse mechanism: those who pay finance those who collect #The paradox is structural. Large consulting firms — the ones with hundreds of employees and seven-figure revenues — negotiate long terms with suppliers (i.e., consultants) and shorter terms with end clients. The difference between collection and payment becomes a financial float that the company uses as zero-cost liquidity.\nThe freelance consultant is the weakest link in the chain. No negotiating power, no legal department, no immediate alternatives. 
They accept the 90 days because the market works that way and because refusing means staying without projects.\nI\u0026rsquo;ve seen even more creative situations:\nContracts specifying \u0026ldquo;90 days from timesheet approval date\u0026rdquo; — where approval arrives weeks late Invoices \u0026ldquo;blocked\u0026rdquo; for invented formal errors, to push payment to the next cycle Payments split without apparent reason, with the final balance arriving after six months These aren\u0026rsquo;t exceptions. They\u0026rsquo;re consolidated tactics. Every Italian consultant with a bit of experience recognises them instantly.\n⚖️ Why nobody enforces the law #The legitimate question is: if the law provides for 30 days and late payment interest is automatic, why does nobody claim it?\nThe answer is simple and bitter: because the reputational cost is higher than the financial cost.\nA consultant who sends a formal request for late payment interest to their client is a consultant who won\u0026rsquo;t be called again. Not because they\u0026rsquo;re wrong — they\u0026rsquo;re right, legally and morally. But because the IT consulting market in Italy is a relationship market, and in a relationship market, whoever asserts their rights is perceived as \u0026ldquo;difficult.\u0026rdquo;\nIt\u0026rsquo;s the same mechanism by which nobody asks for paid overtime, nobody refuses the Friday evening business trip, and nobody contests the rate that drops with every contract renewal.\nThe system runs on the structural docility of the consultant. And it works, as long as it works.\n🛠️ What you can actually do #I don\u0026rsquo;t have magic solutions. But I have strategies I\u0026rsquo;ve been using for years that work — or at least limit the damage.\n1. Negotiate terms BEFORE signing #It seems obvious, yet most consultants sign the contract without reading the payment clause. The moment to negotiate is before, not after. 
A \u0026ldquo;payment at 30 days from invoice date\u0026rdquo; written in the contract is worth more than any subsequent protest.\n2. Include an explicit late payment interest clause #Even though the law provides for it automatically, writing it in the contract has a deterrent effect. The client knows you\u0026rsquo;re not joking. I use a simple formula: \u0026ldquo;In case of late payment, late payment interest shall apply pursuant to Legislative Decree 231/2002, equal to the ECB rate plus 8 percentage points.\u0026rdquo;\n3. Diversify your clients #The consultant with a single client is the most vulnerable consultant. If 100% of your revenue depends on someone who pays at 120 days, you have no negotiating leverage. If you have three clients and one is a chronic late payer, you can afford to let them go.\n4. Evaluate clients on payment punctuality too #Before accepting an engagement, ask your colleagues. The IT consulting world is small. Those who pay badly are known. So are those who pay well. I keep a mental list — and it\u0026rsquo;s not short.\n5. Consider international clients #Not out of reverse patriotism, but out of pragmatism. Northern European clients pay better, pay sooner and have more transparent administrative processes. With remote working, working for a Dutch company from Rome is no longer science fiction. It\u0026rsquo;s my Tuesday.\n📊 A comparison that speaks volumes #Take the same IT consultant — same skills, same seniority, same type of project. Only the client\u0026rsquo;s country changes.\n🇮🇹 Italian client vs 🇩🇪 German client:\nDaily rate: €250 vs €450\nContractual payment term: 90 days end of month vs 30 days\nActual average payment: ~120 days vs ~28 days\nLate payment interest claimed: never (out of fear) vs not needed (they pay on time)\nInvoice approval process: labyrinthine vs one confirmation email\nContract renewal: \u0026ldquo;we\u0026rsquo;ll let you know\u0026rdquo; vs planned 3 months in advance\nI\u0026rsquo;m not saying that working with Italian clients is always worse. 
I\u0026rsquo;m saying that the numbers tell a precise story, and ignoring it for the sake of a quiet life isn\u0026rsquo;t professionalism. It\u0026rsquo;s resignation.\n💬 A matter of respect #Chronic late payment isn\u0026rsquo;t just a financial problem. It\u0026rsquo;s a problem of professional respect.\nWhen a company pays a consultant at 120 days, it\u0026rsquo;s telling them: \u0026ldquo;Your time and your work can wait. Our schedule comes before yours.\u0026rdquo;\nIt\u0026rsquo;s the same message conveyed when they ask you to start \u0026ldquo;Monday\u0026rdquo; but the contract arrives three weeks later. When the timesheet must be filled in to the minute but the invoice can sit in a drawer for months. When the project is extremely urgent but the payment isn\u0026rsquo;t.\nI have thirty years of consulting behind me. I\u0026rsquo;ve worked with clients who paid me in 14 days and clients who took eight months. The quality of the professional relationship has never been a matter of budget. It\u0026rsquo;s always been a matter of how they treat those who work for them.\nThe best clients I\u0026rsquo;ve had paid on time. Not out of generosity — out of organisation. Because a company that knows how to manage its payments is a company that knows how to manage its projects. And vice versa.\nThose who pay badly almost always manage everything else badly too.\nGlossary #DSO — Days Sales Outstanding — average number of days a company takes to collect its receivables. 
In Italy the average is 80 days, nearly three times the European average.\nFinancial Float — Zero-cost liquidity generated by the difference between collection times from clients and payment times to suppliers.\nLate Payment Interest — Automatic interest prescribed by law (ECB rate + 8%) accruing on every invoice paid late, without the need for formal notice.\nPartita IVA — Italian tax regime for self-employed workers and freelancers, which in IT consulting implies directly bearing the credit risk toward clients.\nDirective 2011/7/EU — European directive on late payments setting the standard term at 30 days, maximum at 60, and providing automatic late interest.\n","date":"10 March 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/pagamenti-60-90-120-giorni/","section":"Database Strategy","summary":"\u003cp\u003eThe first time I worked with an international client, something strange happened. They paid me in thirty days.\u003c/p\u003e\n\u003cp\u003eNot thirty days from the end of the month. Not thirty days from the invoice receipt date stamped and countersigned by the administration manager. Thirty days from the invoice. Period.\u003c/p\u003e\n\u003cp\u003eI checked my bank statement twice. I thought it was a mistake.\u003c/p\u003e\n\u003cp\u003eIt wasn\u0026rsquo;t a mistake. It was normality — just not Italian normality.\u003c/p\u003e","title":"Payment at 60-90-120 days: the Italian normality that doesn't exist in Europe"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/payments/","section":"Tags","summary":"","title":"Payments"},{"content":"Download PDF | LinkedIn Profile\nProfessional Profile #Data Warehouse Architect and IT professional with nearly 30 years of experience in designing, implementing, and managing complex, high-performing DWH solutions in Oracle and PostgreSQL environments. 
Expert in multidimensional data modeling methodologies (Kimball, Inmon) and in optimizing ETL/ELT processes and SQL queries on datasets ranging from hundreds of millions to billions of rows. Proven ability to lead end-to-end DWH projects — from requirements analysis to production deployment — ensuring data integrity, quality, and availability to support business decisions. Technical leadership and problem-solving mindset in international and full-remote contexts.\nTechnical Skills # DWH Methodologies: Multidimensional Data Modeling (Kimball, Inmon), Star Schema, Snowflake Schema, Slowly Changing Dimensions (SCD Type 1/2/3), Bus Matrix design. Oracle Databases: Oracle Database (8i through 21c), Oracle Exadata, Oracle RAC, Oracle Data Guard, Oracle Autonomous Database (ADB), Performance Tuning (AWR, ADDM, SQL Tuning Advisor), Storage Management (ASM), Backup \u0026amp; Recovery (RMAN), Oracle TDE. PostgreSQL: PostgreSQL (14+), Query Optimization, Table Partitioning, pg_stat_statements, PgBouncer, Logical Replication, VACUUM/Autovacuum tuning. Cloud Platforms: Oracle Cloud Infrastructure (OCI). ETL/ELT Tools: PL/SQL-based ETL pipelines, Unix Shell Scripting, Oracle Data Integrator (ODI), Oracle Warehouse Builder (OWB, legacy projects). Business Intelligence \u0026amp; Reporting: Oracle Analytics Cloud (OAC) — Semantic Model Designer, Reports, Dashboards. Programming \u0026amp; Scripting Languages: SQL (advanced), PL/SQL, Unix Shell Scripting. Operating Systems: Linux (RHEL, CentOS, Oracle Linux), Unix, Windows Server. Other: Project Management (Agile/Scrum), Team Leadership (up to 7 people), Technical Training. Work Experience #IDEA DB CONSULTING S.R.L. — Rome, Italy (Full Remote Europe) #Data Warehouse Architect | Oracle \u0026amp; PostgreSQL Expert | Sole Director | 2021 – Present\nDWH Architect (for ATRADIUS) | 2022 – 2026:\nDesigned the Surety division Data Warehouse to consolidate data from Italy, Spain, France, and Northern European countries. 
Integrated heterogeneous sources — Oracle databases, Microsoft SQL Server, and flat files from external systems — into a unified Oracle-based DWH. Modeled core business domains: client portfolio, policies, contracts, billing, claims, and claims transactions. Developed the entire data model and ETL layer — over 60K lines of PL/SQL code — with a real-time load monitoring system. Full daily ingestion from all three sources completes in under 2 hours. DWH Architect (for FAI SERVICE) | 2021 – 2023:\nDesigned a Snowflake Schema data model on Oracle Analytics Cloud with ETL processes running on Oracle 19c in OCI. Developed dashboards and reports on OAC for billing statistics, customer segmentation, client portfolio analysis, and cost/revenue tracking. DWH Design and Architecture (Banking, Telepass, and other clients):\nDefined and implemented Data Warehouse architectures for clients in the banking sector and Telepass partners, following Kimball and Inmon methodologies. Applied multidimensional data modeling (Star Schema, Snowflake) to optimize analysis and reporting across datasets exceeding 2 billion rows. Designed and developed ETL/ELT flows for integrating data from 15+ heterogeneous sources. Designed and implemented DWH solutions on PostgreSQL as cost-effective alternatives to Oracle-based architectures, including partitioning strategies and query performance optimization. Oracle Database Management and Optimization:\nAdvanced administration, tuning, and performance optimization of Oracle databases, including OCI and Autonomous Database environments. Optimization of complex SQL queries — reduced batch processing times from 4 hours to under 30 minutes on critical analytical workloads. Leadership and Consulting:\nTechnical guidance and training for internal teams and clients on DWH and Oracle/PostgreSQL best practices. Installation and patching of Oracle databases, Oracle OEM and RMAN management. Development of reports and dashboards with Oracle Analytics Cloud (OAC). 
NIMIS CONSULTING S.R.L. — Rome, Italy (Full Remote) #Oracle DBA | DWH Architect | Oracle Performance \u0026amp; Tuning Expert (for TIM / HUAWEI) | 2020 – 2022\nDWH/DBA Project Administration:\nManagement and administration of over 30 critical Oracle databases (70+ instances) on Oracle Exadata clusters (3 and 5 nodes) for a major telecom client. Responsible for 24/7 on-call availability of database systems supporting 20M+ prepaid mobile subscribers. Managed fact tables ingesting up to 800M phone traffic records per day, requiring advanced partitioning and compression strategies. Advanced Performance Tuning:\nIn-depth performance analysis via AWR, ADDM, and proactive tuning to ensure SLAs under 500ms response time on critical queries. Implementation of indexing, partitioning, and compression strategies to enhance DWH system performance. Data security management with Oracle TDE (Transparent Data Encryption). IDEA DB CONSULTING S.R.L. — Rome, Italy (Full Remote) #PL/SQL Expert | Oracle DBA \u0026amp; Tuning Expert (for FINWAVE S.p.A.) | 2020 – 2022\nAdvanced PL/SQL development and query optimization for financial applications processing millions of daily transactions. Consulting on Oracle architectures and performance tuning best practices. FREELANCE / INDEPENDENT CONSULTANT — Rome, Italy (Full Remote Europe) #Oracle DBA | Oracle Performance \u0026amp; Tuning Expert | DWH Architect | 2013 – 2020\nDesign and development of Data Warehouses for various clients across banking, insurance, and telecom sectors, applying Kimball/Inmon methodologies on Oracle and PostgreSQL platforms. Development of ETL/ELT processes for DWH data loading — built pipelines handling 500M+ rows per load cycle. Specialized consultancy on Oracle and PostgreSQL databases, SQL performance tuning, DWH architectures, PL/SQL, and ETL design. Team leadership (teams of 3–7 people) in multicultural contexts, managing tasks and priorities using Agile methodology. 
Created reports and performed data analysis with Oracle Analytics Cloud (OAC). AUSELDA AED GROUP S.P.A. — Rome, Italy #Data Warehouse Architect | Performance \u0026amp; Tuning Expert | Oracle Project DBA (for Public Administration) | 2009 – 2013\nData Warehouse design and modeling (Kimball/Inmon) for Public Administration entities. Development of ETL/ELT processes and optimization of complex SQL queries. Oracle Warehouse Builder (OWB) product specialist. Technical management of DWH projects and user training. ORACLE ITALIA S.R.L. — Various Locations, Italy \u0026amp; Madrid, Spain #Data Warehouse Architect | Oracle DBA | DWH Designer | SQL \u0026amp; PL/SQL Developer | Training Specialist | 1999 – 2009\nDWH Architectural and Development Roles (2001-2009):\nDesign and development of Data Warehouses for leading clients in Telco (TIM, Vodafone, TRE), Finance (Banca d\u0026rsquo;Italia, Generali, RAS), Pharmaceutical (Menarini), and other sectors. Implementation of data models (Kimball/Inmon) and ETL/ELT processes with Oracle Warehouse Builder. Consultancy on SQL Performance \u0026amp; Tuning, PL/SQL, BI Reports (Oracle Discoverer, HTMLDB), Oracle OLAP. International engagement with Vodafone Spain (Madrid). Training Specialist (2000-2001):\nDelivered Oracle courses: SQL (Basic and Advanced), PL/SQL (Basic and Advanced), Oracle Database Administration, Performance and Tuning, Oracle Discoverer, Forms, Reports. Web Developer | Oracle SQL \u0026amp; PL/SQL Developer (ETNOTEAM S.P.A. - 1999): Web portal and client-server application development.\nS.EL.DAT. S.P.A. — Rome, Italy #Software Developer (for Telecom, Rover Italia) | 1997 – 1999\nClient-server application development, Junior Oracle DBA, DB monitoring. 
Education and Training # Faculty of Computer Engineering (Software Engineering) | University of Rome Tre, Rome | 1994 – 2000 Scientific High School Diploma (EQF Level 4) | Liceo Scientifico Isacco Newton / Manieri Copernico, Rome | 1988 – 1993 Advanced English (C1/C2) | The British Council (Level 4A), Rome | 2003 – 2004 Selected Training and Continuing Education Courses:\nScrum Agile — Randstad/Forma.temp (May 2024) Project Management — Randstad/Forma.temp (May 2024) Data Wrangling, Analysis and AB Testing with SQL — Coursera, University of California Davis (April 2021) Data Science on Google Cloud Platform: Designing Data Warehouses — LinkedIn Learning (September 2020) Multiple Oracle 12c specialization courses (Administration, Security, Backup and Recovery, Advanced SQL, Performance Optimization) — LinkedIn Learning (2020) MySQL Installation and Configuration — LinkedIn Learning (August 2020) Learning Git and GitHub — LinkedIn Learning (August 2020) Languages # Italian: Native English: C1/C2 (Fluent, professional) Spanish: C1 (Fluent) Romanian: C1 (Fluent) French: A1/A2 (Basic) Soft Skills # Analytical and Creative Problem Solving Teamwork and Collaboration in International Contexts Project Management (Requirements definition, Prioritization, Gantt charts, ERD) Effective Communication and Stakeholder Engagement I consent to the processing of my personal data pursuant to Art. 
13 of EU Regulation 2016/679 (GDPR).\nRome, March 2026\nDownload PDF | LinkedIn Profile | Back to previous page\n","date":"10 March 2026","permalink":"https://ivanluminaria.com/en/resumes/dwh-architect/","section":"Know-How \u0026 Impact","summary":"\u003cp\u003e\u003cstrong\u003e\u003ca href=\"https://ivanluminaria.com/downloads/CV_DWH_Architect_Ivan_Luminaria_202603_EN.pdf\" target=\"_blank\" rel=\"noreferrer\"\u003eDownload PDF\u003c/a\u003e\u003c/strong\u003e | \u003cstrong\u003e\u003ca href=\"https://www.linkedin.com/in/ivanluminaria\" target=\"_blank\" rel=\"noreferrer\"\u003eLinkedIn Profile\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"professional-profile\" class=\"relative group\"\u003eProfessional Profile \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#professional-profile\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eData Warehouse Architect and IT professional with nearly 30 years of experience in designing, implementing, and managing complex, high-performing DWH solutions in Oracle and PostgreSQL environments. Expert in multidimensional data modeling methodologies (Kimball, Inmon) and in optimizing ETL/ELT processes and SQL queries on datasets ranging from hundreds of millions to billions of rows. Proven ability to lead end-to-end DWH projects — from requirements analysis to production deployment — ensuring data integrity, quality, and availability to support business decisions. 
Technical leadership and problem-solving mindset in international and full-remote contexts.\u003c/p\u003e","title":"Data Warehouse Architect"},{"content":"Download PDF | LinkedIn Profile\nProfessional Profile #Highly skilled Senior Oracle DBA and Performance Tuning Expert with nearly 30 years of specialized experience in the administration, optimization, and management of complex, mission-critical Oracle databases, including Exadata, RAC, and Oracle Cloud (OCI, Autonomous Database) environments. Deep expertise in performance analysis (AWR, ADDM, ASH), tuning complex SQL queries, instance optimization, and resolving performance issues. Foundational knowledge of backup \u0026amp; recovery (RMAN), security (TDE), installation, patching, migrations, and storage management (ASM), with performance tuning remaining the primary and in-depth area of expertise, supporting the integrity, performance, and high availability of Oracle systems in demanding, international corporate settings.\nTechnical Skills — Oracle DBA \u0026amp; Performance Tuning # Oracle Database Administration: Oracle Database (from 8i to 21c, Autonomous), Oracle Exadata, Oracle RAC (Real Application Clusters), Oracle Data Guard, Oracle GoldenGate (basic knowledge). Advanced Performance Tuning: Analysis and Diagnostics: AWR, ADDM, ASH, Statspack, SQL Trace, TKPROF, Explain Plan. SQL Tuning: Optimization of complex queries, Hints, SQL Profiles, SQL Plan Management (SPM). Instance Tuning: Memory Management (SGA, PGA), Initialization Parameters, Wait Events Analysis. Database Design for Performance: Indexing (B-tree, Bitmap, Function-based), Partitioning (Range, List, Hash, Composite), Compression. High Availability and Disaster Recovery: RMAN (Backup, Recovery, Cloning), Oracle Data Guard, Flashback Technologies. Database Security: Oracle TDE (Transparent Data Encryption), User and Privilege Management, Auditing. 
Storage Management: ASM (Automatic Storage Management), Tablespace Management. Installation, Patching, and Migrations: Installation of new instances, application of PSU/CPU/RU, version upgrades, cross-platform migrations. Oracle Tools: Oracle Enterprise Manager (OEM) Cloud Control, SQL Developer, SQL*Plus, Toad. Scripting: PL/SQL, SQL, Unix Shell Scripting (for DBA task automation). Cloud: Oracle Cloud Infrastructure (OCI) — Compute, Storage, Networking, Database Services (VM DB, Bare Metal, Exadata CS, Autonomous Database). Other Database Technologies: PostgreSQL (administration, performance tuning, partitioning, replication), MySQL (administration, configuration, replication, InnoDB optimization). Operating Systems: Linux (Red Hat, Oracle Linux), Unix (AIX, Solaris), Windows Server. Work Experience #IDEA DB CONSULTING S.R.L. — Rome, Italy (Full Remote Europe) #Senior Oracle DBA \u0026amp; Performance Tuning Expert / DWH Architect | Sole Director | 2021 – Present\nMySQL \u0026amp; PostgreSQL DBA (for POSTE ITALIANE) | Jul 2025 – Present:\nAdministration and management of approximately 1,500 MySQL and PostgreSQL instances across production, staging, and development environments. Performance monitoring, query tuning, replication management, and capacity planning at enterprise scale. Oracle DBA \u0026amp; Tuning Expert (for GENERALI Assicurazioni) | Feb 2024 – May 2025:\nOracle database administration and advanced performance tuning for insurance-sector applications. AWR/ADDM analysis, SQL optimization, and proactive bottleneck resolution on databases ranging from 500GB to 8TB. Oracle DBA (for ATRADIUS) | 2022 – 2026:\nDatabase administration and performance tuning supporting the Surety division Data Warehouse consolidating data from Italy, Spain, France, and Northern European countries. Management of Oracle databases in OCI environments, supporting over 60K lines of PL/SQL ETL code with full daily ingestion completing in under 2 hours. 
Oracle DBA (for FAI SERVICE) | 2021 – 2023:\nAdministration and tuning of Oracle 19c databases in OCI supporting ETL processes and Oracle Analytics Cloud dashboards. DBA \u0026amp; Performance Tuning (Banking, Telepass, and other clients):\nOracle database optimization in OCI and Autonomous Database environments — reduced batch processing times from 4 hours to under 30 minutes on critical analytical workloads. Designed and developed ETL/ELT flows integrating data from 15+ heterogeneous sources across datasets exceeding 2 billion rows. RMAN management, Oracle OEM monitoring, installation and patching of Oracle databases. Designed and implemented DWH solutions on PostgreSQL as cost-effective alternatives to Oracle architectures. NIMIS CONSULTING S.R.L. — Rome, Italy (Full Remote) #Senior Oracle DBA \u0026amp; Performance Tuning Expert (for TIM / HUAWEI) | 2020 – 2022\nAdministration and management of over 30 critical Oracle databases (70+ instances) on Oracle Exadata clusters (3/5 nodes) for a leading Telco client. Direct responsibility for advanced performance tuning: AWR/ADDM analysis, SQL optimization, index management, partitioning, and compression. Involvement in storage management (ASM) activities and Oracle TDE implementation for data security. 24/7 \u0026ldquo;ON-CALL\u0026rdquo; support for resolving critical issues and maintaining high availability. FREELANCE / INDEPENDENT CONSULTANT — Rome, Italy (Full Remote Europe) #Senior Oracle DBA \u0026amp; Performance Tuning Expert / DWH Architect | 2013 – 2020\nProvision of consulting services as Oracle DBA and Performance Tuning specialist for various clients. Optimization of complex SQL queries and tuning of Oracle instances to improve the performance of critical applications. Support in installation, configuration, patching, and upgrading Oracle databases. Consultancy on backup and recovery strategies with RMAN. 
Design and implementation of Oracle Data Guard configurations for high availability and disaster recovery, including switchover/failover procedures. Management of small technical teams in migration and upgrade projects. AUSELDA AED GROUP S.P.A. — Rome, Italy #Project Oracle DBA / Performance \u0026amp; Tuning Expert (for Public Administration) | 2009 – 2013\nAdministration and optimization of Oracle databases supporting Public Administration applications. Tuning of SQL queries and ETL processes for Data Warehousing systems. Involvement in installation, patching, and database security management activities. ORACLE ITALIA S.R.L. — Various Locations, Italy \u0026amp; Madrid, Spain #Oracle DBA / DWH Architect / SQL \u0026amp; PL/SQL Developer / Training Specialist | 1999 – 2009\nDBA and Performance Roles (with progressively increasing responsibility): Administration of Oracle databases for enterprise clients (TIM, Vodafone, Banca d\u0026rsquo;Italia, Generali). Involvement in SQL Performance \u0026amp; Tuning, troubleshooting, and optimization activities. Support for installation and configuration of Oracle instances, patch management. As Training Specialist (2000-2001): Delivered courses on Oracle Database Administration and Performance \u0026amp; Tuning. S.EL.DAT. S.P.A. — Rome, Italy #Software Developer (for Telecom, Rover Italia) | 1997 – 1999\nClient-server application development, Junior Oracle DBA, DB monitoring. 
Education and Training # Faculty of Computer Engineering (Software Engineering) | University of Rome Tre, Rome | 1994 – 2000 Scientific High School Diploma (EQF Level 4) | Liceo Scientifico Isacco Newton / Manieri Copernico, Rome | 1988 – 1993 Advanced English (C1/C2) | The British Council (Level 4A), Rome | 2003 – 2004 Selected Training and Continuing Education Courses:\nAdvanced SQL for Query Tuning and Performance Optimization — LinkedIn Learning (August 2020) Data Wrangling, Analysis and AB Testing with SQL — Coursera, University of California Davis (April 2021) Multiple Oracle 12c specialization courses (Administration, Security, Backup and Recovery, Advanced SQL, New Features) — LinkedIn Learning (2020) Languages # Italian: Native English: C1/C2 (Fluent, professional) Spanish: C1 (Fluent) Romanian: C1 (Fluent) French: A1/A2 (Basic) Soft Skills # Analytical and Methodical Problem Solving Priority Management and Adherence to Deadlines Ability to Work Under Pressure Clear and Effective Technical Communication Attention to Detail and Precision Continuous Learning and Technological Adaptability I consent to the processing of my personal data pursuant to Art. 
13 of EU Regulation 2016/679 (GDPR).\nRome, March 2026\nDownload PDF | LinkedIn Profile | Back to previous page\n","date":"10 March 2026","permalink":"https://ivanluminaria.com/en/resumes/oracle-dba/","section":"Know-How \u0026 Impact","summary":"\u003cp\u003e\u003cstrong\u003e\u003ca href=\"https://ivanluminaria.com/downloads/CV_Oracle_DBA_Ivan_Luminaria_202603_EN.pdf\" target=\"_blank\" rel=\"noreferrer\"\u003eDownload PDF\u003c/a\u003e\u003c/strong\u003e | \u003cstrong\u003e\u003ca href=\"https://www.linkedin.com/in/ivanluminaria\" target=\"_blank\" rel=\"noreferrer\"\u003eLinkedIn Profile\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"professional-profile\" class=\"relative group\"\u003eProfessional Profile \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#professional-profile\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eHighly skilled Senior Oracle DBA and Performance Tuning Expert with nearly 30 years of specialized experience in the administration, optimization, and management of complex, mission-critical Oracle databases, including Exadata, RAC, and Oracle Cloud (OCI, Autonomous Database) environments. Deep expertise in performance analysis (AWR, ADDM, ASH), tuning complex SQL queries, instance optimization, and resolving performance issues. 
Foundational knowledge of backup \u0026amp; recovery (RMAN), security (TDE), installation, patching, migrations, and storage management (ASM), with performance tuning remaining the primary and in-depth area of expertise, supporting the integrity, performance, and high availability of Oracle systems in demanding, international corporate settings.\u003c/p\u003e","title":"Oracle DBA \u0026 Performance Tuning Expert"},{"content":"Download PDF | LinkedIn Profile\nProfessional Profile #Senior Oracle PL/SQL Developer with nearly 30 years of experience in designing, developing, testing, and optimizing robust and efficient PL/SQL code for data-intensive applications and Data Warehouse systems. Deep expertise in creating complex packages, procedures, functions, triggers, and types, with a constant focus on performance, maintainability, and code quality. Expert in optimizing complex SQL queries and managing large volumes of data. Skilled in translating business requirements into high-performing and scalable application logic within the Oracle database. Also brings a consolidated background as an Oracle DBA and DWH Architect, providing a comprehensive view of the data lifecycle.\nTechnical Skills — Oracle PL/SQL Development # Languages: PL/SQL (Advanced), SQL (Advanced, including Dynamic SQL, Analytic Functions, CTEs). PL/SQL Development: Packages, Procedures, Functions, Triggers. PL/SQL Data Types (Records, Collections, Object Types). Error and Exception Handling. Bulk Processing (FORALL, BULK COLLECT). Dynamic SQL (DBMS_SQL, Execute Immediate). PL/SQL Code Optimization (incl. use of PL/SQL Hierarchical Profiler). Interaction with Database Objects (Tables, Views, Sequences, Synonyms). SQL Optimization \u0026amp; Performance: Execution Plan Analysis (Explain Plan), SQL Trace, TKPROF. SQL Tuning Techniques (Hints, Query Rewriting, Indexes). Understanding the impact of database design on PL/SQL performance. 
Development Tools: SQL Developer, Toad, SQL*Plus. Oracle Database: Oracle Database (from 8i to 21c, Autonomous Database). Related Concepts: Data Warehousing (ETL/ELT logic development), Data Integration, Relational and Multidimensional Data Modeling. Version Control: Git, GitHub. Scripting: Unix Shell Scripting (for deployment automation and script management). Cloud: Oracle Cloud Infrastructure (OCI) — knowledge of database services. Work Experience #IDEA DB CONSULTING S.R.L. — Rome, Italy (Full Remote Europe) #Senior Oracle PL/SQL Developer \u0026amp; DWH Architect | 2022 – Present\nPL/SQL Developer (for ATRADIUS) | 2022 – 2026:\nDeveloped and maintained over 60,000 lines of PL/SQL code (packages, procedures, functions) for the Surety division Data Warehouse, consolidating insurance claims and credit data from Italy, Spain, France, and Northern European countries. Designed reusable PL/SQL templates for data loading procedures — standardized the internal ETL workflow with built-in checkpoints for real-time tracking and monitoring of data flows, enabling junior developers to follow a consistent, repeatable pattern. Implemented real-time load monitoring via custom PL/SQL logging packages, providing instant visibility on ETL pipeline status and error handling across all loading stages. Optimized batch processing performance: reduced full daily ingestion cycle from 4+ hours to under 2 hours through query rewriting, BULK COLLECT/FORALL patterns, and partition-aware DML. PL/SQL Developer (for FINWAVE S.p.A.) | 2020 – 2022:\nDeveloped PL/SQL packages for financial transaction processing applications handling millions of daily operations across banking and insurance clients. Advanced query optimization and PL/SQL code tuning for high-volume financial data pipelines. PL/SQL Developer (for FAI SERVICE) | 2021 – 2023:\nDeveloped ETL procedures in PL/SQL on Oracle 19c/OCI for billing statistics, customer segmentation, and cost/revenue tracking data flows. 
Built PL/SQL modules feeding Oracle Analytics Cloud dashboards with aggregated financial KPIs. PL/SQL Development (Banking, Telepass, and other clients):\nDeveloped PL/SQL business logic packages for banking-sector DWH applications, processing datasets exceeding 2 billion rows. Optimization of existing PL/SQL code and SQL queries — applied Hierarchical Profiler analysis to identify bottlenecks and improve critical path execution times. Collaborated with development teams and functional analysts for requirements definition and PL/SQL-based solution design. NIMIS CONSULTING S.R.L. — Rome, Italy (Full Remote) #Senior Oracle DBA \u0026amp; Performance Expert with Development Focus (for TIM / HUAWEI) | 2020 – 2022\nProvided specialist support to development teams in optimizing PL/SQL code and SQL queries for critical applications on Exadata databases. Analysis and tuning of high-volume PL/SQL batch processes. Development of PL/SQL scripts for monitoring and administration tasks. FREELANCE / INDEPENDENT CONSULTANT — Rome, Italy (Full Remote Europe) #Senior Oracle PL/SQL Developer \u0026amp; DBA / DWH Architect | 2013 – 2020\nDesign and development of custom PL/SQL solutions for various clients, including packages for ETL logic, data processing procedures, and PL/SQL APIs. Intensive optimization of PL/SQL and SQL code to improve the performance of existing systems. Development of PL/SQL modules for extracting, transforming, and loading (ETL) data into DWH systems. Training and mentoring junior developers on PL/SQL development best practices. AUSELDA AED GROUP S.P.A. — Rome, Italy #Oracle PL/SQL Developer \u0026amp; DWH Specialist (for Public Administration) | 2009 – 2013\nDevelopment of PL/SQL components for Data Warehousing systems and management applications for Public Administration. Evolutionary and corrective maintenance of PL/SQL code. Optimization of ETL processes based on PL/SQL and OWB. ORACLE ITALIA S.R.L. 
— Various Locations, Italy \u0026amp; Madrid, Spain #SQL \u0026amp; PL/SQL Developer / DWH Architect / DBA / Training Specialist | 1999 – 2009\nPL/SQL Development Roles (significant and growing): Intensive PL/SQL code development for DWH, BI, and custom application projects for enterprise clients in various sectors (Telco, Finance, Pharmaceutical). Creation of PL/SQL packages for complex business logic, data loading procedures (ETL) with Oracle Warehouse Builder and PL/SQL. Development of BI Reports and HTMLDB (Apex) interfaces with PL/SQL logic. As Training Specialist (2000-2001): Delivered Oracle SQL (Basic and Advanced) and PL/SQL (Basic and Advanced) courses. ETNOTEAM S.P.A. — Rome, Italy #Web Developer / Oracle SQL \u0026amp; PL/SQL Developer | 1999\nDevelopment of web portals and client-server applications with strong interaction with Oracle databases, using SQL and PL/SQL for backend logic. S.EL.DAT. S.P.A. — Rome, Italy #Software Developer / Junior DBA | 1997 – 1999\nDevelopment of client-server applications with Oracle backend; initial experiences with SQL and PL/SQL. 
Education and Training # Faculty of Computer Engineering (Software Engineering) | University of Rome Tre, Rome | 1994 – 2000 Scientific High School Diploma (EQF Level 4) | Liceo Scientifico Isacco Newton / Manieri Copernico, Rome | 1988 – 1993 Advanced English (C1/C2) | The British Council (Level 4A), Rome | 2003 – 2004 Selected Training and Continuing Education Courses:\nScrum Agile — Randstad/Forma.temp (May 2024) Project Management — Randstad/Forma.temp (May 2024) Data Wrangling, Analysis and AB Testing with SQL — Coursera, University of California Davis (April 2021) Advanced SQL for Query Tuning and Performance Optimization — LinkedIn Learning (August 2020) Multiple Oracle 12c specialization courses (Advanced SQL, New Features) — LinkedIn Learning (2020) Learning Git and GitHub — LinkedIn Learning (August 2020) Languages # Italian: Native English: C1/C2 (Fluent, professional) Spanish: C1 (Fluent) Romanian: C1 (Fluent) French: A1/A2 (Basic) Soft Skills # Analytical Approach and Orientation to Solving Complex Problems Writing Clear, Efficient, and Maintainable Code Advanced Debugging and Troubleshooting Skills Excellent Understanding of Functional and Technical Requirements Effective Collaboration in Development Teams Attention to Detail and Software Quality I consent to the processing of my personal data pursuant to Art. 
13 of EU Regulation 2016/679 (GDPR).\nRome, March 2026\nDownload PDF | LinkedIn Profile | Back to previous page\n","date":"10 March 2026","permalink":"https://ivanluminaria.com/en/resumes/oracle-plsql/","section":"Know-How \u0026 Impact","summary":"\u003cp\u003e\u003cstrong\u003e\u003ca href=\"https://ivanluminaria.com/downloads/CV_Oracle_PLSQL_Ivan_Luminaria_202603_EN.pdf\" target=\"_blank\" rel=\"noreferrer\"\u003eDownload PDF\u003c/a\u003e\u003c/strong\u003e | \u003cstrong\u003e\u003ca href=\"https://www.linkedin.com/in/ivanluminaria\" target=\"_blank\" rel=\"noreferrer\"\u003eLinkedIn Profile\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"professional-profile\" class=\"relative group\"\u003eProfessional Profile \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#professional-profile\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eSenior Oracle PL/SQL Developer with nearly 30 years of experience in designing, developing, testing, and optimizing robust and efficient PL/SQL code for data-intensive applications and Data Warehouse systems. Deep expertise in creating complex packages, procedures, functions, triggers, and types, with a constant focus on performance, maintainability, and code quality. Expert in optimizing complex SQL queries and managing large volumes of data. Skilled in translating business requirements into high-performing and scalable application logic within the Oracle database. 
Also brings a consolidated background as an Oracle DBA and DWH Architect, providing a comprehensive view of the data lifecycle.\u003c/p\u003e","title":"Oracle PL/SQL Developer"},{"content":"Download PDF | LinkedIn Profile\nProfessional Profile #Project Manager with nearly 30 years of IT experience and a solid technical background in Data Warehouse and Oracle environments. Over 10 projects delivered with a strong on-time track record, managing teams of 3-7 people in multicultural, remote settings on projects valued in the €100K-€500K range. Hands-on expertise in planning, risk management, resource coordination and stakeholder engagement — developed through years of leading development, release and maintenance activities for clients in Banking, Telco, Insurance and Public Administration. Agile (Scrum) methodologies applied in day-to-day project management, backed by certified training. Deep technical background (DWH architecture, Oracle DBA, PL/SQL, ETL/ELT) that enables effective communication with development teams, realistic feasibility assessments and early identification of technical risks.\nKey Skills # Project Management \u0026amp; Methodologies: Project planning, milestone definition and progress tracking. Risk and issue management — proactive identification and mitigation planning. Agile and Scrum (Sprint Planning, Daily Stand-up, Retrospective, Backlog Refinement) — Certified training. Requirements definition, scope management and stakeholder engagement. Release planning and delivery coordination. Project Management \u0026amp; Productivity Tools: Microsoft Project and Jira (planning, tracking, backlog management). Microsoft Excel and Google Sheets (reporting, project dashboards, data analysis — daily use). Git and GitHub (code management, issue tracking, CI/CD, collaborative workflows). Technical Skills Supporting PM: Data Warehouse Architecture (Kimball, Inmon, Google Cloud Platform). Oracle Database: DBA, Performance Tuning, PL/SQL. 
Advanced SQL, ETL/ELT processes, Business Intelligence. Technical feasibility assessment and development effort estimation. Leadership \u0026amp; Communication: Coordination of distributed technical teams (3-7 people, multi-country). Effective communication with technical teams, business stakeholders and management. Mentoring and team skill development. Meeting facilitation and conflict management. Work Experience #IDEA DB CONSULTING S.R.L. — Rome, Italy (Full Remote Europe) #Project Manager \u0026amp; Senior DWH Architect | 2022 – Present\nPM \u0026amp; DWH Lead (for GENERALI Insurance) | Feb 2024 – May 2025:\nCoordinated project activities and managed priorities for the development team on Oracle databases in the insurance sector. Direct client interface for requirements gathering, scope definition and solution presentation. Progress monitoring and issue management across databases ranging from 500GB to 8TB. PM \u0026amp; DWH Lead (for ATRADIUS) | 2022 – 2026:\nManaged the data consolidation project for the Surety division — coordinating data integration from 4 European countries (Italy, Spain, France, Northern Europe). Release planning and development backlog management across 60,000+ lines of PL/SQL ETL code. Activity tracking, stakeholder reporting and cross-team dependency management. Project Coordinator (Banking, Telepass and other clients):\nCoordinated projects achieving batch processing time reduction from 4 hours to under 30 minutes. Managed data integration from 15+ heterogeneous sources across datasets exceeding 2 billion rows. Release planning and oversight in Oracle OCI and Autonomous Database environments. NIMIS CONSULTING S.R.L. — Rome, Italy (Full Remote) #Senior Oracle DBA \u0026amp; Performance Tuning Expert (for TIM / HUAWEI) | 2020 – 2022\nPlanned and executed maintenance and patching activities across 30+ critical Oracle databases (70+ instances) on Exadata clusters. 
Coordinated with the development team to optimize database interactions and resolve performance issues — specialized technical support and intervention prioritization. Autonomous workload management and tuning activities, with regular reporting to the project lead. FREELANCE / INDEPENDENT CONSULTANT — Rome, Italy (Full Remote Europe) #Project Manager \u0026amp; Senior DWH Consultant | 2013 – 2020\nManaged approximately 10 projects over 7 years for clients in Banking, Telco and services sectors, with budgets in the €100K-€500K range and a strong on-time delivery record. Coordinated teams of 3-7 people in multicultural, distributed settings — task assignment, collaboration facilitation and priority management using an iterative Agile approach. Direct client interface for requirements gathering, scope definition, progress reporting and expectation management. Technical training and mentoring for team members — skill development and onboarding of new consultants. AUSELDA AED GROUP S.P.A. — Rome, Italy #Senior Data Warehouse Specialist \u0026amp; Oracle DBA (for Public Administration) | 2009 – 2013\nCollaborated with project leads on technical requirements definition, activity planning and solution validation. Coordinated development and optimization activities for Data Warehouses serving Public Administration entities. Technical support and user training on DWH platforms. ORACLE ITALIA S.R.L. — Various Locations, Italy \u0026amp; Madrid, Spain #Senior Consultant / DWH Specialist / Training Specialist | 1999 – 2009\nParticipated in complex DWH and BI implementation projects for major clients (Telco, Finance, Pharmaceutical), with increasing responsibilities in activity coordination and junior consultant support. Managed relationships with client technical leads and provided progress reporting. 
As Training Specialist (2000-2001): Delivered Oracle technical courses, managed classroom dynamics and adapted content — presentation and training skills directly applicable to the PM role.\nEducation and Training #\nFaculty of Computer Engineering (Software Engineering) | University of Rome Tre, Rome | 1994 – 2000\nScientific High School Diploma (EQF Level 4) | Liceo Scientifico Isacco Newton / Manieri Copernico, Rome | 1988 – 1993\nAdvanced English (C1/C2) | The British Council (Level 4A), Rome | 2003 – 2004\nSelected Training and Continuing Education Courses:\nProject Management — Randstad/Forma.temp (May 2024)\nScrum Agile — Randstad/Forma.temp (May 2024)\nData Wrangling, Analysis and AB Testing with SQL — Coursera, University of California Davis (April 2021)\nData Science on Google Cloud Platform: Designing Data Warehouses — LinkedIn Learning (September 2020)\nMultiple Oracle 12c specialization courses (Administration, Security, Backup and Recovery, Advanced SQL, Performance Optimization) — LinkedIn Learning (2020)\nLearning Excel 2016 — LinkedIn Learning (August 2020)\nLearning Git and GitHub — LinkedIn Learning (August 2020)\nLanguages #\nItalian: Native\nEnglish: C1/C2 (Fluent, professional)\nSpanish: C1 (Fluent)\nRomanian: C1 (Fluent)\nFrench: A1/A2 (Basic)\nSoft Skills #\nAbility to translate technical requirements into concrete, manageable project plans\nSimultaneous management of multiple projects with competing priorities\nEffective communication at all levels (technical team, stakeholders, management)\nAnalytical Problem Solving and proactive risk management\nExperience working in distributed, multicultural and fully remote teams\nMentoring and team skill development\nI consent to the processing of my personal data pursuant to Art.
13 of EU Regulation 2016/679 (GDPR).\nRome, March 2026\nDownload PDF | LinkedIn Profile | Back to previous page\n","date":"10 March 2026","permalink":"https://ivanluminaria.com/en/resumes/project-manager/","section":"Know-How \u0026 Impact","summary":"\u003cp\u003e\u003cstrong\u003e\u003ca href=\"https://ivanluminaria.com/downloads/CV_Project_Manager_Ivan_Luminaria_202603_EN.pdf\" target=\"_blank\" rel=\"noreferrer\"\u003eDownload PDF\u003c/a\u003e\u003c/strong\u003e | \u003cstrong\u003e\u003ca href=\"https://www.linkedin.com/in/ivanluminaria\" target=\"_blank\" rel=\"noreferrer\"\u003eLinkedIn Profile\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"professional-profile\" class=\"relative group\"\u003eProfessional Profile \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#professional-profile\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eProject Manager with nearly 30 years of IT experience and a solid technical background in Data Warehouse and Oracle environments. Over 10 projects delivered with a strong on-time track record, managing teams of 3-7 people in multicultural, remote settings on projects valued in the €100K-€500K range. Hands-on expertise in planning, risk management, resource coordination and stakeholder engagement — developed through years of leading development, release and maintenance activities for clients in Banking, Telco, Insurance and Public Administration. Agile (Scrum) methodologies applied in day-to-day project management, backed by certified training. 
Deep technical background (DWH architecture, Oracle DBA, PL/SQL, ETL/ELT) that enables effective communication with development teams, realistic feasibility assessments and early identification of technical risks.\u003c/p\u003e","title":"Project Manager"},{"content":"Monday morning. Alarm at 6:40. Shower, quick breakfast, car keys on the table. I leave the house at 7:15.\nI live in the Appio Latino neighbourhood. The office is on Via Crescenzio, in Prati. Eight kilometres as the crow flies. Should take fifteen minutes. In Rome, it\u0026rsquo;s a different story.\n🚗 The morning by car #Via Appia Nuova is already a car park. Porta San Giovanni, a funnel. Lungotevere, a funeral procession with horns.\nFifty minutes to cover eight kilometres. Fifty minutes of clutch, traffic lights, double-parking, scooters cutting in and buses stopping in the second lane.\nBut the best part is yet to come.\nI arrive in Prati and the parking hunt begins. Via Crescenzio, full. Via Tacito, full. The side streets, full. I circle for an hour and a half. An hour and a half. Crawling through Prati\u0026rsquo;s streets with the engine running and my patience dropping below zero.\nIn the end, exhausted, I surrender. Piazza Cavour multi-storey car park. \u0026ldquo;At least I\u0026rsquo;ll find a spot here,\u0026rdquo; I think.\nI find a spot. I also find the bill: that evening, when I leave, the receipt says €35.\nThirty-five euros for the privilege of leaving my car sitting still all day.\nLet\u0026rsquo;s take stock of that morning:\nItem | Value\nTime leaving home | 7:15\nTime in traffic | 50 minutes\nTime looking for parking | 1 hour 30 minutes\nTime arriving at desk | 9:35\nParking cost | €35\nStress level | ████████████ 200%\nFirst hour productivity | close to zero\nTwo hours and twenty minutes. To travel eight kilometres.
And €35 lighter.\nThat evening I go home thinking: \u0026ldquo;There must be another way.\u0026rdquo;\n🚲 The following week: the electric Brompton #The next Monday I change everything.\nI leave the house at 7:30 — fifteen minutes later than the week before. I go downstairs with my electric Brompton. I unfold it in ten seconds on the pavement. Helmet, backpack, off I go.\nSame route. Appio Latino → San Giovanni → Celio → Lungotevere → Prati.\nBut this time everything is different.\nWhile cars are stuck in queues, I ride past. While scooters slalom between bumpers, I pedal calmly in the cycle lane. I don\u0026rsquo;t sweat — it\u0026rsquo;s an electric bike, the pedal assist does its job. I don\u0026rsquo;t stress — nobody\u0026rsquo;s honking at me. I don\u0026rsquo;t look for parking — I fold the Brompton and carry it into the office.\nEighteen minutes. Door to door.\nItem | Value\nTime leaving home | 7:30\nTime cycling | 18 minutes\nTime looking for parking | 0 minutes\nTime arriving at desk | 7:50\nCost | €0\nStress level | ░░░░░░░░░░░░ 0%\nFirst hour productivity | maximum\nAt 7:50 I was sitting at my desk. Fresh. Alert. Coffee in hand and my head already on the first task of the day.\nOne hour and forty-five minutes earlier than the previous week.\nThirty-five euros saved.\nAnd above all: zero stress.\n📊 The numbers over a full year #Let\u0026rsquo;s do the real maths. Over 220 working days:\nItem | 🚗 Car | 🚲 Brompton\nAverage door-to-door time (one way) | 50 min + 30 min parking | 18 min\nDaily time (round trip) | ~2h 40min | ~36 min\nAnnual travel time | ~587 hours | ~132 hours\nHours saved by bike | — | ~455 hours\nAnnual fuel cost | ~€1,800 | €0\nParking/ZTL/fines | ~€1,200 (conservative) | €0\nInsurance + road tax + wear | ~€2,500 | €0\nAnnual mobility cost | ~€5,500 | ~€50 (maintenance)\nAnnual savings | — | ~€5,450\n455 hours saved. That\u0026rsquo;s 57 working days. Two and a half months of life given back.\nAnd those €5,450 saved? That\u0026rsquo;s a holiday. A pension contribution.
The cost of the Brompton itself, paid off in less than a year.\n💪 The benefits you can\u0026rsquo;t measure in euros #But the financial numbers are only part of the story. The ones that truly matter are the others.\nCardiovascular health #Cycling 36 minutes a day, even with pedal assist, counts as moderate physical activity. The World Health Organisation recommends at least 150 minutes of aerobic activity per week. By cycling to work, you get 180 minutes without even thinking about it.\nStudies published in the British Medical Journal show that cycle commuters have:\n41% lower risk of death from all causes\n52% lower risk of death from cardiovascular disease\n45% lower risk of developing cancer\nThese aren\u0026rsquo;t my numbers. They\u0026rsquo;re science\u0026rsquo;s numbers, based on a sample of over 250,000 British commuters tracked over five years.\nMental health #Rome\u0026rsquo;s traffic isn\u0026rsquo;t just boring. It\u0026rsquo;s toxic for the mind. Chronic commuting stress is linked to:\nelevated cortisol levels\nsleep disorders\nincreased irritability and anxiety\nreduced ability to concentrate\nCycling, on the other hand, releases endorphins. You arrive at work with an oxygenated brain, a good mood, and a sense of autonomy that a car seat in traffic will never give you.\nAir quality #A car stuck in Rome\u0026rsquo;s traffic produces on average 120–150 g of CO₂ per kilometre. In congested traffic, even more — because the engine idles, consuming fuel without moving.\nA bike produces zero emissions.\nIf just 10% of Roman commuters switched to daily cycling, it would save around 150,000 tonnes of CO₂ per year. That\u0026rsquo;s the equivalent of planting 7 million trees.\nIt\u0026rsquo;s not idealism. It\u0026rsquo;s arithmetic.\n🤔 \u0026ldquo;But it rains, it\u0026rsquo;s hot, it\u0026rsquo;s dangerous\u0026hellip;\u0026rdquo; #I know. I\u0026rsquo;ve heard every objection.
I\u0026rsquo;ve made them all myself.\n\u0026ldquo;What about when it rains?\u0026rdquo;\nIt rains about 75 days a year in Rome. On heavy rain days, I take the metro from San Giovanni to Lepanto. Fifteen minutes. No car even in a downpour. The Brompton folds up and rides the metro with me.\n\u0026ldquo;In summer it\u0026rsquo;s too hot.\u0026rdquo;\nWith pedal assist you don\u0026rsquo;t sweat. And even if you did slightly, 18 minutes of open air beats 50 minutes in a scorching cabin with air conditioning drying out your throat.\n\u0026ldquo;Rome\u0026rsquo;s roads aren\u0026rsquo;t safe.\u0026rdquo;\nThis is true, and I don\u0026rsquo;t minimise it. Rome needs more cycling infrastructure. But my route — Appio Latino, Celio, Lungotevere — is reasonably safe, especially during rush hour when traffic is so slow that cars move slower than bikes.\n\u0026ldquo;I can\u0026rsquo;t bring a bike into the office.\u0026rdquo;\nThe Brompton folds in 20 seconds and becomes luggage you can slide under your desk. That\u0026rsquo;s its superpower: it completely eliminates the parking problem.\n🌍 It\u0026rsquo;s not just a personal choice #Every person who leaves the car at home and takes the bike:\nfrees up a parking space for someone who truly needs it\nreduces traffic for those who must drive\nimproves air quality for everyone\nreduces noise pollution in the neighbourhood\nproves that another model is possible\nI\u0026rsquo;m not asking anyone to sell their car. I\u0026rsquo;m saying that for many urban journeys — those under 10 km — the bike is objectively superior to the car. Faster, cheaper, healthier, more sustainable.\nAnd with a folding electric bike, the last excuses fall one by one.\n🇪🇺 They\u0026rsquo;re already doing it in Europe — and it works #While in Rome we debate whether cycling to work is even possible, half of Europe has been doing it for decades.\nAmsterdam has more bicycles than inhabitants — 881,000 bikes for 872,000 residents.
The problem isn\u0026rsquo;t convincing people to cycle, but where to park all those bikes. At Central Station they built the world\u0026rsquo;s largest bicycle parking facility: 12,500 spaces across three underground levels. Twelve thousand five hundred. Not car spaces. Bike spaces.\nCopenhagen has reached a historic milestone: over 60% of residents cycle to work. Not out of ideology, but practicality. The average commute takes 13 minutes. Try doing the same by car.\nMunich, Berlin, Vienna — cities with far harsher winters than Rome — have comprehensive cycling networks and urban cyclist rates that Rome can only dream of.\nAnd you don\u0026rsquo;t need to look beyond the Alps. Milan, Bologna, Ferrara, Padua — in northern Italy, cycling to work is normal. It\u0026rsquo;s not heroism. It\u0026rsquo;s not eccentricity. It\u0026rsquo;s common sense.\nThe seven hills? With an e-bike, they\u0026rsquo;re no longer an excuse #I know what you\u0026rsquo;re thinking. \u0026ldquo;Yes, but those are flat cities. Rome has seven hills.\u0026rdquo;\nIt\u0026rsquo;s true. Rome has climbs. The Celio, the Aventine, the Janiculum — they\u0026rsquo;re not exactly the Po Valley.\nBut that objection made sense ten years ago. Today, with an electric bike, the seven hills no longer exist. The motor assists you uphill, you reach the top without gasping, without sweating, without missing the car.\nMy electric Brompton tackles the Celio climb as if it were a gentle bump. The Lungotevere is flat. And the final stretch to Prati is downhill.\nRome has 300 sunny days a year, a mild climate even in winter, and compact urban distances. It is — paradoxically — one of the Italian cities most suited to cycling. All we lack is the infrastructure. And the courage to change habits.\n🏠 Bike and smart working: the perfect combination #There\u0026rsquo;s a deep connection between choosing the bike and the smart working philosophy. 
Both start from the same question: \u0026ldquo;Does what I\u0026rsquo;m doing make sense, or am I doing it just because it\u0026rsquo;s always been done this way?\u0026rdquo;\nSmart working eliminates the commute on days when you don\u0026rsquo;t need to be in the office. The bike makes the commute smart on the days when you do.\nThe ideal model? 3 days remote, 2 in the office — by bike.\nModel | Weekly travel hours | Stress | Cost\n5 days by car | ~13 hours | high | ~€110/week\n5 days car + smart working (3+2) | ~5 hours | medium | ~€44/week\n2 days bike + 3 smart working | ~1.2 hours | zero | ~€0\nFrom 13 hours to 1 hour and 12 minutes. From €110 to zero.\nIt\u0026rsquo;s not utopia. It\u0026rsquo;s organisation.\nIt\u0026rsquo;s not laziness. It\u0026rsquo;s intelligence.\nIf companies combined smart working with sustainable mobility incentives, the result would be threefold: healthier employees, higher productivity, and more liveable cities. But to get there, we need to stop thinking that \u0026ldquo;working\u0026rdquo; means \u0026ldquo;sitting in an office from 9 to 6 after an hour in the car.\u0026rdquo;\n🎯 My personal balance sheet #Since I made the switch, my morning looks like this:\nTime | Activity\n7:00 | Wake up\n7:00 — 7:25 | Relaxed breakfast, news\n7:30 | Leave home on the Brompton\n7:48 | Arrive at office, bike folded under the desk\n7:50 | Operational\nNo stress. No costs. No €35 surprises.\nAnd in the evening, same thing in reverse: 18 minutes and I\u0026rsquo;m home. Not an hour. Not \u0026ldquo;depends on traffic.\u0026rdquo; Eighteen minutes, every time.\nI\u0026rsquo;ve reclaimed time. I\u0026rsquo;ve reclaimed money.
I\u0026rsquo;ve reclaimed mental energy.\nBut above all, I\u0026rsquo;ve reclaimed the pleasure of moving through the city instead of enduring it.\n💬 To those still stuck in traffic #If every morning you spend an hour in the car for a journey that would take twenty minutes by bike.\nIf every evening you come home drained, not from work, but from getting there.\nIf you\u0026rsquo;ve ever calculated how much you spend on fuel, parking, and mental health.\nTry it. Even just for a week.\nGet a bike — folding, electric, whatever you prefer — and ride the same route.\nCheck your watch when you arrive. Check how you feel. Check your wallet at the end of the month.\nThe numbers speak for themselves.\nBut the smile on your face when you reach your desk — that\u0026rsquo;s priceless.\nGlossary #Brompton — British folding bicycle considered the world reference for build quality, folded compactness and practicality in urban commuting.\nPedal Assist — Electric propulsion system that amplifies the cyclist\u0026rsquo;s pedaling force, eliminating the problem of hills and sweat on urban commutes.\nFolding Bike — Bicycle that folds in 10-20 seconds becoming a portable package for the office, metro or train.\nCommuting — Daily home-to-work travel and back, which in large cities can absorb 2-4 hours per day and hundreds of euros per month.\nCarbon Footprint — Total amount of greenhouse gases emitted by an activity — a car in Roman traffic produces 120-150 g of CO₂ per km, a bike zero.\nSustainable Mobility — Approach to urban transport that favors low environmental impact means, reducing emissions, traffic and costs.\n","date":"3 March 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/bici-vs-auto-roma/","section":"Database Strategy","summary":"\u003cp\u003eMonday morning. Alarm at 6:40. Shower, quick breakfast, car keys on the table. I leave the house at 7:15.\u003c/p\u003e\n\u003cp\u003eI live in the Appio Latino neighbourhood. 
The office is on Via Crescenzio, in Prati. Eight kilometres as the crow flies. Should take fifteen minutes. In Rome, it\u0026rsquo;s a different story.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"-the-morning-by-car\" class=\"relative group\"\u003e🚗 The morning by car \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#-the-morning-by-car\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eVia Appia Nuova is already a car park. Porta San Giovanni, a funnel. Lungotevere, a funeral procession with horns.\u003c/p\u003e","title":"Bike vs Car in Rome: the morning that opened my eyes"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/cycling/","section":"Tags","summary":"","title":"Cycling"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/health/","section":"Tags","summary":"","title":"Health"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/smart-working/","section":"Tags","summary":"","title":"Smart-Working"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/sustainability/","section":"Tags","summary":"","title":"Sustainability"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/urban-mobility/","section":"Tags","summary":"","title":"Urban-Mobility"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/hugepages/","section":"Tags","summary":"","title":"Hugepages"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/kernel/","section":"Tags","summary":"","title":"Kernel"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/linux/","section":"Tags","summary":"","title":"Linux"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/o
racle/","section":"Categories","summary":"","title":"Oracle"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/oracle/","section":"Tags","summary":"","title":"Oracle"},{"content":"The client was a logistics company running Oracle 19c Enterprise Edition on Oracle Linux 8. Sixty concurrent users, a custom ERP application, about 400 GB of data. The server was a Dell PowerEdge with 128 GB of RAM and 32 cores.\nThe complaints were vague but persistent: \u0026ldquo;The system is slow.\u0026rdquo; \u0026ldquo;Morning queries take twice as long as two months ago.\u0026rdquo; \u0026ldquo;Every now and then everything freezes for a few seconds.\u0026rdquo;\nWhen I logged into the server, the first thing I checked was not the database. It was the operating system.\ncat /proc/meminfo | grep -i huge\nsysctl vm.nr_hugepages\ncat /sys/kernel/mm/transparent_hugepage/enabled\nResult: zero Huge Pages configured, Transparent Huge Pages active, kernel parameters all at default values. The Oracle installation had been done with the wizard; the operating system had never been touched.\nThere was the problem. It wasn\u0026rsquo;t Oracle. It was Linux that hadn\u0026rsquo;t been prepared for Oracle.\n🔍 The diagnosis #Before changing anything, I measured the current state.
You need numbers, not impressions.\n# SGA status\nsqlplus -s / as sysdba \u0026lt;\u0026lt;SQL\nSELECT name, value/1024/1024 AS mb FROM v$sgainfo WHERE name IN (\u0026#39;Maximum SGA Size\u0026#39;, \u0026#39;Free SGA Memory Available\u0026#39;);\nSQL\n# System memory usage\nfree -h\n# Current kernel parameters\nsysctl -a | grep -E \u0026#34;kernel.sem|kernel.shm|vm.nr_hugepages|vm.swappiness\u0026#34;\n# I/O scheduler in use\ncat /sys/block/sda/queue/scheduler\n# Oracle user limits\nsu - oracle -c \u0026#34;ulimit -a\u0026#34;\nHere is what I found:\nParameter | Current value | Recommended value\nSGA Target | 64 GB | 64 GB (ok)\nvm.nr_hugepages | 0 | 33280\nTransparent Huge Pages | always | never\nvm.swappiness | 60 | 1\nkernel.shmmax | 33554432 (32 MB) | 68719476736 (64 GB)\nkernel.shmall | 2097152 | 16777216\nkernel.sem | 250 32000 100 128 | 250 32000 100 256\nI/O scheduler | mq-deadline | deadline (ok)\noracle nofile | 1024 | 65536\noracle nproc | 4096 | 16384\noracle memlock | 65536 KB | unlimited\nNearly everything was wrong. Not by mistake — by omission. Nobody had bothered to configure the operating system after installation.\n📦 Huge Pages: the parameter that changes everything #Huge Pages are the single most impactful parameter for Oracle on Linux. And they are also the one most often ignored.\nWhy they matter #By default, Linux manages memory in 4 KB pages. A 64 GB SGA means roughly 16.7 million pages. Each page has an entry in the Page Table, and the system must translate virtual addresses to physical ones for each. The CPU\u0026rsquo;s TLB (Translation Lookaside Buffer) can cache only a few thousand translations — the rest is handled by the MMU, which is slow.\nHuge Pages are 2 MB pages. The same 64 GB SGA becomes 32,768 pages.
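The page arithmetic is easy to verify for yourself; a two-line sketch in shell (64 GB is the SGA size from this case):

```shell
# Pages needed to map a 64 GB SGA: standard 4 KB pages vs 2 MB Huge Pages
sga_bytes=$((64 * 1024 * 1024 * 1024))
echo "4 KB pages: $((sga_bytes / 4096))"      # prints 16777216 (~16.7 million)
echo "2 MB pages: $((sga_bytes / 2097152))"   # prints 32768
```

Dividing by 2097152 (2 MB) instead of 4096 is the whole point: 512 times fewer page-table entries for the TLB to cover.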
The TLB copes, MMU pressure drops, performance improves.\nHow to configure them #I calculated the number of Huge Pages needed:\n# SGA = 64 GB = 65536 MB\n# Each Huge Page = 2 MB\n# Pages needed = 65536 / 2 = 32768\n# Adding a ~1.5% margin and rounding up → 33280\necho \u0026#34;vm.nr_hugepages = 33280\u0026#34; \u0026gt;\u0026gt; /etc/sysctl.d/99-oracle.conf\nsysctl -p /etc/sysctl.d/99-oracle.conf\nVerification:\ngrep -i huge /proc/meminfo\nExpected output:\nHugePages_Total: 33280\nHugePages_Free: 33280\nHugePages_Rsvd: 0\nHugepagesize: 2048 kB\nAfter restarting the Oracle instance, the SGA gets allocated in Huge Pages:\nHugePages_Total: 33280\nHugePages_Free: 512\nHugePages_Rsvd: 480\nThe difference is measurable: latch free waits and library cache contention drop dramatically.\n🧱 Shared memory and semaphores #Oracle uses kernel shared memory for the SGA. If the limits are too low, the instance cannot allocate the requested memory — or worse, fragments the allocation.\ncat \u0026gt;\u0026gt; /etc/sysctl.d/99-oracle.conf \u0026lt;\u0026lt; \u0026#39;SYSCTL\u0026#39;\n# Shared memory\nkernel.shmmax = 68719476736\nkernel.shmall = 16777216\nkernel.shmmni = 4096\n# Semaphores: SEMMSL SEMMNS SEMOPM SEMMNI\nkernel.sem = 250 32000 100 256\nSYSCTL\nsysctl -p /etc/sysctl.d/99-oracle.conf\nParameter | Meaning | Value\nshmmax | Maximum size of a single shared memory segment | 64 GB\nshmall | Total pages of shared memory allocatable | 64 GB in 4K pages\nshmmni | Maximum number of shared memory segments | 4096\nsem | SEMMSL, SEMMNS, SEMOPM, SEMMNI | 250 32000 100 256\nThese are not magic numbers. They are sized for the database\u0026rsquo;s SGA. If the SGA changes, the parameters need recalculating.\n💾 I/O Scheduler #The default on RHEL/Oracle Linux 8 with NVMe devices is none or mq-deadline.
For traditional SAS/SATA disks, the default may be bfq (or, on older kernels, cfq).\nFor Oracle, the recommendation is deadline (or mq-deadline on newer, blk-mq-only kernels):\n# Check current setting\ncat /sys/block/sda/queue/scheduler\n# If not deadline/mq-deadline, set it (use mq-deadline on blk-mq kernels)\necho \u0026#34;deadline\u0026#34; \u0026gt; /sys/block/sda/queue/scheduler\n# Make it permanent via GRUB\ngrubby --update-kernel=ALL --args=\u0026#34;elevator=deadline\u0026#34;\nNote that the elevator= boot parameter is deprecated on RHEL/Oracle Linux 8; if it has no effect on your kernel, set the scheduler with a udev rule or a tuned profile instead.\ncfq (Completely Fair Queuing) is designed for desktop workloads — it distributes I/O fairly across processes. But Oracle doesn\u0026rsquo;t need fairness: it needs I/O requests served in the order that minimises seeks. deadline does exactly that.\n🚫 Disabling Transparent Huge Pages #This is the most insidious parameter. Transparent Huge Pages (THP) is a kernel feature that sounds like a good idea: the kernel automatically promotes normal pages to huge pages.\nFor Oracle it is a disaster. The khugepaged process works in the background to compact pages, causing unpredictable latency spikes — those \u0026ldquo;freezes for a few seconds\u0026rdquo; the client had been complaining about.\nOracle says it explicitly in the documentation: disable THP.\n# Check current state\ncat /sys/kernel/mm/transparent_hugepage/enabled\n# Typical output: [always] madvise never\n# Disable at runtime\necho never \u0026gt; /sys/kernel/mm/transparent_hugepage/enabled\necho never \u0026gt; /sys/kernel/mm/transparent_hugepage/defrag\n# Make permanent via GRUB\ngrubby --update-kernel=ALL --args=\u0026#34;transparent_hugepage=never\u0026#34;\nAfter reboot, verify:\ncat /sys/kernel/mm/transparent_hugepage/enabled\n# Expected output: always madvise [never]\nThe difference is stark: random micro-freezes disappear.\n🔒 Security limits #The oracle user needs elevated limits on open file descriptors, processes and lockable memory.
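Before editing anything, it helps to capture what a shell actually inherits today; a quick check with plain ulimit (run it as the oracle user to see the values the instance would get — the numbers on your system will differ):

```shell
# Soft limits relevant to Oracle: open files, processes, locked memory, stack
ulimit -n    # nofile
ulimit -u    # nproc
ulimit -l    # memlock, in KB
ulimit -s    # stack, in KB
```

Each command prints either a number or the word unlimited, which maps directly onto the limits.d entries below.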
Linux defaults are designed for interactive users, not for software that manages hundreds of simultaneous connections.\ncat \u0026gt;\u0026gt; /etc/security/limits.d/99-oracle.conf \u0026lt;\u0026lt; \u0026#39;LIMITS\u0026#39;\noracle soft nofile 65536\noracle hard nofile 65536\noracle soft nproc 16384\noracle hard nproc 16384\noracle soft stack 10240\noracle hard stack 32768\noracle soft memlock unlimited\noracle hard memlock unlimited\nLIMITS\nLimit | Default | Recommended | Why\nnofile | 1024 | 65536 | Oracle opens a file descriptor for every datafile, redo log, archive log\nnproc | 4096 | 16384 | Each Oracle process is a separate OS process\nmemlock | 65536 KB | unlimited | Required for locking the SGA into Huge Pages\nstack | 8192 KB | 10240-32768 KB | Deep recursive PL/SQL can exhaust the stack\nThe memlock unlimited setting is critical: without it, Oracle cannot lock the SGA into Huge Pages, making the earlier configuration pointless.\n⚡ Swappiness #The default value of vm.swappiness is 60. That means Linux starts swapping when memory pressure is still low. For a dedicated database server, this is unacceptable: you want the SGA to stay in RAM, always.\necho \u0026#34;vm.swappiness = 1\u0026#34; \u0026gt;\u0026gt; /etc/sysctl.d/99-oracle.conf\nsysctl -p /etc/sysctl.d/99-oracle.conf\nNot zero — one. A value of zero does not switch swap off, but it tells the kernel to avoid swapping almost entirely, and under extreme memory pressure that can push the system straight into the OOM killer.
A value of one tells the kernel: \u0026ldquo;Only swap when there is truly no alternative.\u0026rdquo;\n📊 Before and after #After applying all configurations and restarting the Oracle instance, I ran the measurements again.\nMetric | Before | After | Change\nSGA in Huge Pages | No | Yes | —\nLibrary cache hit ratio | 92.3% | 99.7% | +7.4%\nBuffer cache hit ratio | 94.1% | 99.2% | +5.1%\nAverage wait time (db file sequential read) | 8.2 ms | 1.4 ms | -83%\nRandom micro-freezes (\u0026gt;1s) | 5-8 per day | 0 | -100%\nAverage morning batch time | 47 min | 22 min | -53%\nAverage CPU utilisation | 78% | 41% | -47%\nSwap used | 3.2 GB | 0 MB | -100%\nThe numbers speak for themselves. Same machine, same database, same workload. The only difference: the operating system was configured to do its job.\n📋 Final checklist #For those who want an operational summary, here is the complete checklist:\n# /etc/sysctl.d/99-oracle.conf\nvm.nr_hugepages = 33280\nvm.swappiness = 1\nkernel.shmmax = 68719476736\nkernel.shmall = 16777216\nkernel.shmmni = 4096\nkernel.sem = 250 32000 100 256\n# /etc/security/limits.d/99-oracle.conf\noracle soft nofile 65536\noracle hard nofile 65536\noracle soft nproc 16384\noracle hard nproc 16384\noracle soft stack 10240\noracle hard stack 32768\noracle soft memlock unlimited\noracle hard memlock unlimited\n# GRUB (elevator= is deprecated on RHEL/Oracle Linux 8 — prefer a udev rule for the scheduler)\ngrubby --update-kernel=ALL --args=\u0026#34;transparent_hugepage=never elevator=deadline\u0026#34;\nTen minutes of configuration. No hardware cost. No additional licences.\nBut nobody does it, because the wizard doesn\u0026rsquo;t ask, the documentation is buried in an MOS note, and the system \u0026ldquo;works without it.\u0026rdquo; It works. Poorly. And the blame always falls on Oracle, never on the fact that nobody prepared the ground.\nA database is only as good as the operating system it runs on.
And an operating system left at defaults is an operating system working against you.\nGlossary #Huge Pages — 2 MB memory pages (instead of the standard 4 KB) that drastically reduce MMU and TLB pressure, improving Oracle performance on Linux.\nTHP — Transparent Huge Pages — Linux kernel feature that automatically promotes normal pages to huge pages, but causes unpredictable latencies and must be disabled for Oracle.\nSGA — System Global Area — Oracle Database\u0026rsquo;s shared memory area containing buffer cache, shared pool, redo log buffer and other structures critical for performance.\nI/O Scheduler — Linux kernel component that decides the order in which I/O requests are sent to disk, with direct impact on database performance.\nSwappiness — Linux kernel parameter (vm.swappiness) controlling the system\u0026rsquo;s propensity to move memory pages to swap, critical for database servers.\n","date":"24 February 2026","permalink":"https://ivanluminaria.com/en/posts/oracle/oracle-linux-kernel/","section":"Database Strategy","summary":"\u003cp\u003eThe client was a logistics company running Oracle 19c Enterprise Edition on Oracle Linux 8. Sixty concurrent users, a custom ERP application, about 400 GB of data. The server was a Dell PowerEdge with 128 GB of RAM and 32 cores.\u003c/p\u003e\n\u003cp\u003eThe complaints were vague but persistent: \u0026ldquo;The system is slow.\u0026rdquo; \u0026ldquo;Morning queries take twice as long as two months ago.\u0026rdquo; \u0026ldquo;Every now and then everything freezes for a few seconds.\u0026rdquo;\u003c/p\u003e","title":"Oracle on Linux: the kernel parameters nobody configures"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/productivity/","section":"Tags","summary":"","title":"Productivity"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/remote-work/","section":"Tags","summary":"","title":"Remote-Work"},{"content":"6:47 AM on an ordinary Tuesday. 
I\u0026rsquo;m at the park near my house, running gear on. The air is fresh, the sun is barely rising. I\u0026rsquo;ve already done four kilometers. I feel alive.\nBy 7:00 I\u0026rsquo;m in the shower. By 7:20 I\u0026rsquo;m having a calm breakfast. By 7:45 I\u0026rsquo;m at my desk — fresh, focused, ready to work.\nAt that same hour, a colleague of mine is still stuck on the Pontina highway. Or on Rome\u0026rsquo;s ring road, somewhere between the Casilina and Tuscolana exits. Phone in hand — not to work, but to send the usual message: \u0026ldquo;Sorry, running late, there\u0026rsquo;s been an accident.\u0026rdquo;\nTwo people. Same job. Same contract.\nOne is already productive. The other is burning his best energy in a car.\nThis isn\u0026rsquo;t an opinion. These are facts.\nAnd facts have numbers.\n🚗 The invisible cost of commuting in Rome #Let\u0026rsquo;s be honest: Rome is not a city. It\u0026rsquo;s a chaotic organism that moves in fits and starts.\nFor anyone working in IT consulting who lives outside the city center — and in Rome, \u0026ldquo;outside the center\u0026rdquo; means practically everywhere — the daily commute is an ordeal.\nLet\u0026rsquo;s run the numbers on a real scenario:\nItem | Value\nHome-office distance | ~30 km (a quarter of the ring road)\nAverage time one way | 1h 15min — 2h 30min\nAverage return time | 1h — 1h 45min\nTotal daily time in car | 2h 15min — 4h 15min\nWorking days per month | 21\nHours lost per month in car | 47 — 89 hours\nMonthly fuel cost (~30km x 2 x 21 days) | ~€250-300\nCar wear, insurance, parking | ~€150-200/month\nTotal monthly commuting cost | ~€400-500\nNearly 90 hours per month in the worst case. That\u0026rsquo;s more than two full working weeks spent in a car. Not working. Not thinking. Cursing at traffic.\nAnd I haven\u0026rsquo;t counted the stress. The frustration.
The mental energy burned before you even turn on the computer.\n🏃 The other side of the coin: the morning of a remote worker #Here\u0026rsquo;s my typical day:\nTime Activity 6:00 Wake up 6:10 — 6:45 Run in the park (4-5 km) 6:50 — 7:10 Shower 7:10 — 7:30 Calm breakfast 7:30 — 7:45 Set up workstation, coffee, review agenda 7:45 Start working I arrive at my desk after exercising, after breathing fresh air, after having time to think. I don\u0026rsquo;t arrive after fighting a war.\nThe difference isn\u0026rsquo;t just physical. It\u0026rsquo;s cognitive.\nAn IT consultant works with their mind. They analyze systems, write code, design architectures, solve complex problems. If that mind arrives at the office already drained, already frustrated, already tired — how much is that workday really worth?\nI\u0026rsquo;ve worked with teams distributed across three time zones. I\u0026rsquo;ve managed critical databases connecting from home at 3 AM for an emergency. I\u0026rsquo;ve never needed an office to do my job. I\u0026rsquo;ve needed a stable connection, a quiet environment, and a clear mind.\n📊 The numbers for the company: what CFOs don\u0026rsquo;t want to see #IT consulting companies in Rome have a structural problem they pretend doesn\u0026rsquo;t exist.\nTake a company with 50 consultants. Let\u0026rsquo;s do the math:\nCost of corporate commuting # Item Calculation Annual total Hours lost in cars (avg 3h/day x 50 people) 150 hours/day x 220 days 33,000 hours/year Consultant hourly value (avg company cost) 33,000 x €35/hour ~€1,155,000/year Rome office rent (50 workstations) ~€800/workstation/month ~€480,000/year Utilities, cleaning, maintenance ~€60,000/year Estimated total cost ~€1,695,000/year One million seven hundred thousand euros. Every year. 
To keep fifty people sitting in the same place.\nSmart working scenario (80% remote) # Item Calculation Annual total Hours recovered (80% of 33,000) 26,400 hours converted to productive work Downsized office (15 hot-desk stations) ~€800 x 15 ~€144,000/year Employee connectivity contribution €50/month x 50 ~€30,000/year Home office equipment budget (one-time) €1,000 x 50 €50,000 (year 1) Estimated total cost (year 1) ~€224,000 Estimated total cost (from year 2) ~€174,000 Annual savings: over €1,400,000.\nAnd these are conservative estimates.\nBut the most important number isn\u0026rsquo;t the financial one.\nThe most important number is those 26,400 hours returned to productivity.\nHours where people work clear-headed, rested, focused.\nNot hours staring at a bumper on the Cristoforo Colombo highway.\n🧠 The argument nobody has the courage to make #I\u0026rsquo;ll say it clearly: office presenteeism in IT consulting is a cultural relic, not an operational necessity.\nAn IT consultant doesn\u0026rsquo;t work on an assembly line. They don\u0026rsquo;t need to be physically present next to a machine. They need:\na fast internet connection a quiet environment adequate digital tools clear communication with the team measurable objectives All of this works better from home than in a noisy open space where the phone rings every five minutes and someone interrupts you to ask \u0026ldquo;got a minute?\u0026rdquo; (which is never actually a minute).\nThe real problem is control. Some companies don\u0026rsquo;t know how to manage work by objectives. They only know how to manage attendance. And they confuse the two.\nIf a consultant closes 20 tickets in a week working from home in shorts, they\u0026rsquo;re more productive than one who closes 8 wearing a suit in the office from 9 to 6.\nNumbers don\u0026rsquo;t lie. Office chairs do.\n🏢 \u0026ldquo;But what about company culture? Team spirit?\u0026rdquo; #I hear this often. I understand it. 
I don\u0026rsquo;t agree with it, but I understand it.\nCompany culture isn\u0026rsquo;t built by seating people close together. It\u0026rsquo;s built through:\nshared objectives that everyone understands transparent communication that nobody endures meaningful meetings — not the weekly catch-up where everyone stares at their phone under the table mutual trust — which is exactly what\u0026rsquo;s missing when you enforce attendance A team that works well remotely is a team that has learned to truly communicate. Not through physical proximity, but through clarity.\nI\u0026rsquo;ve seen office teams that didn\u0026rsquo;t talk to each other. And distributed teams across three countries that ran like Swiss watches.\nThe difference isn\u0026rsquo;t the place. It\u0026rsquo;s the method.\n🎯 A concrete proposal #If you run an IT consulting firm in Rome — or any major city with mobility problems — here\u0026rsquo;s what I\u0026rsquo;d suggest:\n1. Adopt an 80/20 model\n80% remote, 20% in person. Office days are for workshops, project reviews, real team building — not chair warming.\n2. Invest in the home workstation\n€1,000 one-time per employee: monitor, ergonomic chair, headset with microphone. It\u0026rsquo;s an investment that pays for itself in two weeks of saved rent.\n3. Measure results, not hours\nDefine clear KPIs: tickets closed, code shipped, SLAs met, clients satisfied. Those who produce, produce — regardless of where they are.\n4. Downsize physical spaces\nGo from 50 fixed desks to 15 hot desks. Use the saved space for a proper meeting room and a real break area.\n5. Trust your people\nIf you hired professionals, treat them as professionals. If you don\u0026rsquo;t trust them without seeing them, the problem isn\u0026rsquo;t smart working. 
It\u0026rsquo;s your hiring process.\n💬 To those who recognize themselves in this story #If every morning you wake up an hour earlier than necessary to \u0026ldquo;beat the traffic\u0026rdquo; — and still arrive late.\nIf you spend €500 a month for the privilege of sitting in a queue.\nIf you get to your desk already tired, already stressed, already with a compromised day.\nKnow that it doesn\u0026rsquo;t have to be this way.\nThere\u0026rsquo;s a different way to work. Smarter. More human. More productive.\nAnd the numbers prove it.\nNo revolutions needed. Just the courage to look at those numbers.\nAnd to act accordingly.\nMeanwhile, tomorrow morning I\u0026rsquo;ll wake up at 6, go for a run, and by 7:45 I\u0026rsquo;ll be operational.\nWith a smile. No traffic. No stress.\nAnd my mind already on the first problem to solve.\nGlossary #Smart Working — Flexible work model combining remote work and office presence, based on measurable objectives instead of schedules and physical presence.\nCommuting — Daily home-to-work travel and back, which in large cities can absorb 2-4 hours per day and hundreds of euros per month in direct costs.\nPresenteeism — Organizational culture that equates physical office presence with productivity, regardless of results actually produced.\nKPI — Key Performance Indicator — measurable metric that evaluates the effectiveness of an activity against a defined objective, used to measure concrete results instead of hours of presence.\nHot Desk — Office space model where workstations are unassigned: whoever comes to the office takes an available desk.\n","date":"24 February 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/smartworking-consulenza-it/","section":"Database Strategy","summary":"\u003cp\u003e6:47 AM on an ordinary Tuesday. I\u0026rsquo;m at the park near my house, running gear on. The air is fresh, the sun is barely rising. I\u0026rsquo;ve already done four kilometers. 
I feel alive.\u003c/p\u003e\n\u003cp\u003eBy 7:00 I\u0026rsquo;m in the shower. By 7:20 I\u0026rsquo;m having a calm breakfast. By 7:45 I\u0026rsquo;m at my desk — fresh, focused, ready to work.\u003c/p\u003e\n\u003cp\u003eAt that same hour, a colleague of mine is still stuck on the Pontina highway. Or on Rome\u0026rsquo;s ring road, somewhere between the Casilina and Tuscolana exits. Phone in hand — not to work, but to send the usual message: \u0026ldquo;Sorry, running late, there\u0026rsquo;s been an accident.\u0026rdquo;\u003c/p\u003e","title":"Smart working in IT consulting: the numbers nobody wants to look at"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/tuning/","section":"Tags","summary":"","title":"Tuning"},{"content":"For nearly thirty years, I have not changed profession.\nI have changed depth.\nEvery role I have taken on has not been a lateral move. It has been a vertical deepening.\nWhat I offer today is not a list of skills. It is the sum of the consequences those skills generate.\nData Warehouse Architect #My Data Warehouse experience has developed in environments where data is critical infrastructure.\nI have worked for major Telco operators such as TIM, Wind, Vodafone and 3, on international projects with Huawei, in the insurance sector with Generali, in institutional environments with the Bank of Italy, the Italian Foreign Exchange Office and Cassa Depositi e Prestiti, in the pharmaceutical industry, and in the automotive sector with Rover and Alfa Romeo.\nDifferent industries. One common requirement: reliability.\nIn Telco, scale matters. In banking and public institutions, regulatory precision matters. In insurance, risk consistency matters. In pharma, compliance matters. 
In automotive, process synchronization and data quality across the supply chain matter.\nI have designed data architectures in environments where error is not an option.\n👉 Read the roadmap of my skills and experience as a Data Warehouse Architect | Download PDF\nProject Manager with Technical Background #My project management experience matured while coordinating initiatives in complex environments:\nInternational telecommunications, Central financial institutions, Insurance, Pharma, Automotive and Public Administration.\nIn these sectors, mistakes have a real cost.\nA technical Project Manager understands the impact of decisions. Knows when a compromise is acceptable. And when it is not.\nI connect strategic vision with technical detail. Roadmaps with implementation. Decisions with consequences.\n👉 Read the roadmap of my skills and experience as a Project Manager | Download PDF\nOracle DBA \u0026amp; Performance Tuning Expert #My activity as an Oracle DBA has been consolidated in mission-critical environments for operators such as TIM, Wind, Vodafone and 3, in complex technological contexts like Huawei, for financial institutions such as the Bank of Italy, the Italian Foreign Exchange Office and Cassa Depositi e Prestiti, for insurance organizations like Generali, and in industrial projects within the automotive sector.\nHigh availability systems. Heavy workloads. Real on-call responsibility.\nHere, tuning is not cosmetic improvement. It is operational protection.\n👉 Read the roadmap of my skills and experience as an Oracle DBA | Download PDF\nOracle PL/SQL – Senior \u0026amp; Mentor #I have developed PL/SQL across Telco, Banking, Insurance, Automotive, Pharma and Public Administration environments.\nIn contexts such as Generali or the Bank of Italy, database logic is part of system stability. 
In automotive, data consistency directly impacts production processes and the supply chain.\nFor years, I have taught SQL and PL/SQL, contributing to the technical growth of developers who now work on complex systems.\nToday, I do not simply write code. I can lead. I can mentor. I can help developers grow in both technical depth and architectural thinking.\n👉 Read the roadmap of my skills and experience as an Oracle PL/SQL Developer | Download PDF\n","date":null,"permalink":"https://ivanluminaria.com/en/resumes/","section":"Know-How \u0026 Impact","summary":"\u003cp\u003eFor nearly thirty years, I have not changed profession.\u003c/p\u003e\n\u003cp\u003eI have changed depth.\u003c/p\u003e\n\u003cp\u003eEvery role I have taken on has not been a lateral move.\nIt has been a vertical deepening.\u003c/p\u003e\n\u003cp\u003eWhat I offer today is not a list of skills.\nIt is the sum of the consequences those skills generate.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"data-warehouse-architect\" class=\"relative group\"\u003eData Warehouse Architect \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#data-warehouse-architect\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eMy Data Warehouse experience has developed in environments where data is critical infrastructure.\u003c/p\u003e","title":"Know-How \u0026 Impact"},{"content":" IVAN LUMINARIA Oracle, PostgreSQL \u0026 MySQL Expert | DWH Architect | Project Manager IT professional with approximately 30 years of experience in designing, implementing and managing database and Data Warehouse solutions across Oracle, PostgreSQL and MySQL environments. 
LinkedIn: ivanluminaria\nEmail: ivan (dot) luminaria (at) gmail (dot) com I have been working with databases for about thirty years.\nLong enough to have seen engines, languages, trends and buzzwords change.\nLong enough to know that beneath the surface, the rules never really do.\nA database is not a container.\nIt is an organism.\nIt breathes.\nIt gets tired.\nIt locks.\nOr it scales.\nI started in Oracle mission-critical environments, when “mission-critical” wasn’t a slide-friendly expression.\nIt meant the difference between a system holding up — or collapsing at three in the morning.\nI’ve spent countless hours on AWR, ASH, execution plans.\nI’ve seen systems slow down “for no apparent reason.”\nAnd I learned that there is always a reason.\nYou just have to look for it methodically.\nThen came PostgreSQL and MySQL.\nDifferent tools.\nDifferent philosophies.\nSame discipline.\nUnderstand the engine.\nDon’t fight it.\nI don’t believe in magical tuning.\nI believe in up-to-date statistics.\nIn sound data modeling.\nIn the difference between “it works” and “it holds.”\nBecause a slow database is not a technical issue.\nIt’s a business issue.\nIt’s a report that doesn’t arrive.\nA customer who waits.\nA decision made too late.\nAnd that’s exactly where I like to be —\nwhere technology meets real impact.\nHow I Work #I don’t just “administer” databases.\nI observe them.\nI measure them.\nI stress them.\nI secure them.\nI focus on performance tuning in complex environments — Oracle, RAC, Exadata — as well as PostgreSQL and MySQL in modern, often open-source contexts, where the lack of proprietary “magic” forces you to truly understand what happens under the hood.\nI design Data Warehouse architectures because data is not meant to be stored.\nIt is meant to be understood.\nI’ve worked with multidimensional models, ETL/ELT processes and data flows that must be reliable before they are fast.\nBecause incorrect data, even delivered in milliseconds, is still incorrect.\nI 
write PL/SQL when needed.\nI optimize when required.\nI refactor when inevitable.\nI’m not interested in flashy effects.\nI’m interested in solidity.\nVision #Over the years, I have combined deep technical expertise with broader perspective.\nI have coordinated small international teams.\nTranslated business requirements into sustainable technical decisions.\nLearned that complexity cannot be eliminated.\nIt must be governed.\nAgile, Scrum, structured processes — useful tools.\nBut without real competence, they remain labels.\nFor me, leadership means something simple:\nmaking technical decisions that stand the test of time.\nBeyond the Database #Outside of work, I cultivate passions that, in many ways, speak the same language.\nPhotography taught me that the right light changes the story.\nMusic — I’m learning the saxophone (with patience and humility) and playing guitar — reminds me that technique without sensitivity is just noise.\nCooking is edible architecture: balance, timing, proportion.\nChess is pure strategy: every move is a choice, every choice has consequences.\nPerhaps that’s why I feel comfortable in complex systems.\nThey don’t intimidate me.\nThey intrigue me.\nI like going deep into details.\nBut only to make the whole work better.\nI don’t build databases.\nI build solidity.\n\u0026ldquo;I transform data complexity into strategic business value.\u0026rdquo;\n","date":"20 February 2026","permalink":"https://ivanluminaria.com/en/about/","section":"Ivan Luminaria","summary":"\u003cdiv class=\"profile-header flex flex-col sm:flex-row sm:items-center items-start\"\u003e\n  \u003cdiv class=\"flex-none\"\u003e\n    \u003cimg\n      src='https://ivanluminaria.com/img/ivan_luminaria_avatar.png'\n      alt=\"Ivan Luminaria\"\n      class=\"profile-avatar rounded-full shadow-xl border-4 border-slate-100\"\u003e\n  \u003c/div\u003e\n  \u003cdiv class=\"profile-text text-left\"\u003e\n    \u003ch1 class=\"mb-1\"\u003eIVAN LUMINARIA\u003c/h1\u003e\n    \u003ch3 
class=\"mt-0 text-slate-500\"\u003e\n      Oracle, PostgreSQL \u0026 MySQL Expert | DWH Architect | Project Manager\n    \u003c/h3\u003e\n    \u003cp class=\"mt-4\"\u003e\n      IT professional with approximately \u003cstrong\u003e30 years of experience\u003c/strong\u003e in designing, implementing and managing database and Data Warehouse solutions across Oracle, PostgreSQL and MySQL environments.\n    \u003c/p\u003e\n    \u003cp class=\"mt-2\" style=\"font-size: 0.9em;\"\u003e\n      \u003cstrong\u003eLinkedIn:\u003c/strong\u003e \u003ca href=\"https://www.linkedin.com/in/ivanluminaria/\" target=\"_blank\" rel=\"noopener\"\u003eivanluminaria\u003c/a\u003e\u003cbr\u003e\n      \u003cstrong\u003eEmail:\u003c/strong\u003e ivan (dot) luminaria (at) gmail (dot) com\n    \u003c/p\u003e\n  \u003c/div\u003e\n\u003c/div\u003e\n\u003chr\u003e\n\u003cp\u003eI have been working with databases for about thirty years.\u003c/p\u003e","title":"About me"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/cluster/","section":"Tags","summary":"","title":"Cluster"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/galera/","section":"Tags","summary":"","title":"Galera"},{"content":"The ticket was laconic, as it often is when the problem is serious: \u0026ldquo;The database went down again. The application is stopped. Third time in two months.\u0026rdquo;\nThe client had a MariaDB on a single Linux server — a business management application used by about two hundred internal users, with load spikes during end-of-month accounting closures. Every time the server had a problem — a disk slowing down, a system update requiring a reboot, a process consuming all the RAM — the database crashed and with it the entire business operations.\nThe question wasn\u0026rsquo;t \u0026ldquo;how do we fix the server\u0026rdquo;. 
The question was: how do we make sure that the next time a server has a problem, the application keeps running?\nThe answer, after twenty years of experience with this type of scenario, was one: Galera Cluster.\nThe diagnosis: a classic single point of failure #The first thing I did was analyse the infrastructure. The picture was familiar:\nA single MariaDB server, no replica Nightly backup to external disk (at least that) No failover mechanism The application pointed directly to the database server\u0026rsquo;s IP Every downtime, even ten minutes, meant two hundred people idle. During accounting closures, it meant delays cascading across business processes.\nI proposed a solution based on Galera Cluster: three MariaDB nodes with synchronous multi-master replication. Any node accepts reads and writes, data is consistent across all three, and if one node goes down the other two continue serving the application without interruption.\nThe client already had three Linux VMs available — the infrastructure team had provisioned them for another project that was later postponed. Perfect: no need to even order hardware.\nThe plan: three nodes, zero single point of failure #The available machines:\nNode Hostname IP Node 1 db-node1 10.0.1.11 Node 2 db-node2 10.0.1.12 Node 3 db-node3 10.0.1.13 An important premise: Galera is not a native option in MySQL Community. You either use MariaDB (which integrates it natively) or Percona XtraDB Cluster (based on MySQL). The client was already using MariaDB, so the choice was natural and required no engine migration.\nThe goal was clear: migrate from a single-node architecture to a three-node cluster, without changing the application beyond the connection address.\nInstallation: same version on all nodes #First non-negotiable principle: all nodes must have the exact same version of MariaDB. 
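A quick way to enforce this principle is to compare the server version reported by each node before doing anything else. A minimal sketch (the hostnames are the ones from the table above; the ssh loop and the sample version strings are illustrative, not from the original setup):

```shell
#!/bin/bash
# same_version succeeds only if every version string passed in is identical.
same_version() {
  local first="$1" v
  for v in "$@"; do
    [ "$v" = "$first" ] || return 1
  done
  return 0
}

# In production the versions would come from each node, for example:
#   for h in db-node1 db-node2 db-node3; do ssh "$h" mariadb --version; done
# Here we only demonstrate the comparison logic with sample strings:
if same_version "11.4.2-MariaDB" "11.4.2-MariaDB" "11.4.2-MariaDB"; then
  echo "versions match: safe to proceed"
else
  echo "VERSION MISMATCH: align packages before building the cluster"
fi
```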
I\u0026rsquo;ve seen clusters unstable for months because someone had updated one node and not the others.\nOn all three servers:\n# Add the official MariaDB repository (example for 11.4 LTS) curl -LsS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | \\ sudo bash -s -- --mariadb-server-version=\u0026#34;mariadb-11.4\u0026#34; # Install MariaDB Server and the Galera plugin sudo dnf install MariaDB-server MariaDB-client galera-4 -y # Enable but DO NOT start the service yet sudo systemctl enable mariadb Don\u0026rsquo;t start the service. Configure first. Always.\nThe heart of configuration: /etc/my.cnf.d/galera.cnf #This file defines the cluster\u0026rsquo;s behaviour. It must be created on every node, with the appropriate differences for IP address and node name.\nHere\u0026rsquo;s the complete configuration for Node 1:\n[mysqld] # === Engine and charset === binlog_format=ROW default_storage_engine=InnoDB innodb_autoinc_lock_mode=2 innodb_flush_log_at_trx_commit=2 innodb_buffer_pool_size=1G # === WSREP (Galera) configuration === wsrep_on=ON wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so # List of ALL cluster nodes wsrep_cluster_address=\u0026#34;gcomm://10.0.1.11,10.0.1.12,10.0.1.13\u0026#34; # Cluster name (must be identical on all nodes) wsrep_cluster_name=\u0026#34;galera_production\u0026#34; # THIS node\u0026#39;s identity (changes on each server) wsrep_node_address=\u0026#34;10.0.1.11\u0026#34; wsrep_node_name=\u0026#34;db-node1\u0026#34; # SST method (State Snapshot Transfer) wsrep_sst_method=mariabackup wsrep_sst_auth=\u0026#34;sst_user:secure_password_here\u0026#34; # === Network === bind-address=0.0.0.0 For Node 2 and Node 3, the only things that change are:\n# Node 2 wsrep_node_address=\u0026#34;10.0.1.12\u0026#34; wsrep_node_name=\u0026#34;db-node2\u0026#34; # Node 3 wsrep_node_address=\u0026#34;10.0.1.13\u0026#34; wsrep_node_name=\u0026#34;db-node3\u0026#34; Everything else is identical. Identical. 
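That symmetry can be backed by a check rather than discipline alone: normalise each node's file and diff the results. A sketch under stated assumptions (the two sample files below stand in for /etc/my.cnf.d/galera.cnf fetched from each node; only wsrep_node_address and wsrep_node_name are allowed to differ):

```shell
#!/bin/bash
# normalize strips the only two lines that legitimately differ per node.
normalize() {
  grep -v -e '^wsrep_node_address' -e '^wsrep_node_name' "$1"
}

# Stand-ins for the real files pulled from db-node1 and db-node2:
cat > /tmp/node1.cnf <<'EOF'
wsrep_cluster_name="galera_production"
wsrep_node_address="10.0.1.11"
wsrep_node_name="db-node1"
innodb_buffer_pool_size=1G
EOF

cat > /tmp/node2.cnf <<'EOF'
wsrep_cluster_name="galera_production"
wsrep_node_address="10.0.1.12"
wsrep_node_name="db-node2"
innodb_buffer_pool_size=1G
EOF

normalize /tmp/node1.cnf > /tmp/node1.norm
normalize /tmp/node2.cnf > /tmp/node2.norm

if diff -q /tmp/node1.norm /tmp/node2.norm >/dev/null; then
  echo "configs symmetric"
else
  echo "configs DIVERGE: review before going further"
fi
```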
Don\u0026rsquo;t give in to the temptation to \u0026ldquo;customise\u0026rdquo; buffer pool or other parameters per node: in a Galera cluster, symmetry is a virtue.\nWhy every parameter matters #Let\u0026rsquo;s go through the parameters one by one, because each has a precise reason.\nbinlog_format=ROW #Galera requires ROW format for the binary log. Not STATEMENT, not MIXED. ROW only. With other formats the cluster won\u0026rsquo;t even start — and rightly so, because synchronous replication based on certification depends on row-level precision.\ninnodb_autoinc_lock_mode=2 #This sets the lock mode for auto-increment to \u0026ldquo;interleaved\u0026rdquo;. In a multi-master cluster, two nodes can generate INSERTs simultaneously on the same table. With lock mode 1 (the default) this would create deadlocks. With value 2, InnoDB generates auto-increments without a global lock, allowing concurrent inserts from different nodes.\nThe consequence: auto-increment IDs won\u0026rsquo;t be sequential across nodes. If your application depends on sequential IDs, you have an architectural problem to solve upstream.\ninnodb_flush_log_at_trx_commit=2 #Here we make a conscious trade-off. Value 1 (default) guarantees total durability — every commit is written and fsynced to disk. But in a Galera cluster, durability is already guaranteed by synchronous replication across three nodes. Value 2 writes to the OS buffer on each commit and fsyncs only every second, improving write performance by 30-40% in our tests.\nIf you lose one node, the data is on the other two. If you lose the entire datacenter\u0026hellip; well, that\u0026rsquo;s another conversation.\nwsrep_sst_method=mariabackup #SST is the mechanism by which a node joining the cluster receives a complete copy of the data. 
The options are:\nMethod Pro Con rsync Fast Donor node blocks on reads mariabackup Doesn\u0026rsquo;t block the donor Requires separate installation mysqldump Simple Very slow, blocks the donor Always mariabackup. In production, blocking a donor node during an SST means degrading the cluster at the moment you need it most.\n# Install mariabackup on all nodes sudo dnf install MariaDB-backup -y Firewall: the ports Galera needs open #This is where I see 50% of first installations fail. Galera doesn\u0026rsquo;t just use the MySQL port.\n# On all three nodes sudo firewall-cmd --permanent --add-port=3306/tcp # Standard MySQL sudo firewall-cmd --permanent --add-port=4567/tcp # Galera cluster communication sudo firewall-cmd --permanent --add-port=4567/udp # Multicast replication (optional) sudo firewall-cmd --permanent --add-port=4568/tcp # IST (Incremental State Transfer) sudo firewall-cmd --permanent --add-port=4444/tcp # SST (State Snapshot Transfer) sudo firewall-cmd --reload If SELinux is active (and it should be), you also need the policies:\nsudo setsebool -P mysql_connect_any 1 Four ports. Four. Not one more, not one less. If you forget one, the cluster forms but doesn\u0026rsquo;t synchronise — and debugging becomes an exercise in frustration.\nData migration and the bootstrap #Before starting the cluster, I migrated the data from the standalone server to Node 1 with a full dump:\n# On the old standalone server mysqldump --all-databases --single-transaction --routines --triggers \\ --events \u0026gt; /tmp/full_dump.sql # Transfer to Node 1 scp /tmp/full_dump.sql db-node1:/tmp/ Then the bootstrap — the moment of truth. The first node doesn\u0026rsquo;t start with systemctl start mariadb. 
It starts with the bootstrap command.\nOnly on Node 1:\nsudo galera_new_cluster This command starts MariaDB with wsrep_cluster_address=gcomm:// (empty), which means: \u0026ldquo;I am the founder, I\u0026rsquo;m not looking for other nodes.\u0026rdquo;\nData import and SST user creation:\n-- Import the dump from the old server SOURCE /tmp/full_dump.sql; -- Create the user for data transfer between nodes CREATE USER \u0026#39;sst_user\u0026#39;@\u0026#39;localhost\u0026#39; IDENTIFIED BY \u0026#39;secure_password_here\u0026#39;; GRANT RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT ON *.* TO \u0026#39;sst_user\u0026#39;@\u0026#39;localhost\u0026#39;; FLUSH PRIVILEGES; Immediate verification:\nSHOW STATUS LIKE \u0026#39;wsrep_cluster_size\u0026#39;; -- +--------------------+-------+ -- | Variable_name | Value | -- +--------------------+-------+ -- | wsrep_cluster_size | 1 | -- +--------------------+-------+ If the value is 1, the bootstrap worked. Now, on the other two nodes:\nsudo systemctl start mariadb These nodes read wsrep_cluster_address, find Node 1, receive a full SST with all the data and join the cluster.\nAfter starting all three:\nSHOW STATUS LIKE \u0026#39;wsrep_cluster_size\u0026#39;; -- +--------------------+-------+ -- | Variable_name | Value | -- +--------------------+-------+ -- | wsrep_cluster_size | 3 | -- +--------------------+-------+ Three. That\u0026rsquo;s the magic number. The moment the client stopped having a single point of failure.\nChecking cluster health #This is the part that truly matters for whoever manages the cluster day to day. It\u0026rsquo;s not enough to know that wsrep_cluster_size is 3. 
You need to read the full status.\nThe diagnostic query I always use #SHOW STATUS WHERE Variable_name IN ( \u0026#39;wsrep_cluster_size\u0026#39;, \u0026#39;wsrep_cluster_status\u0026#39;, \u0026#39;wsrep_connected\u0026#39;, \u0026#39;wsrep_ready\u0026#39;, \u0026#39;wsrep_local_state_comment\u0026#39;, \u0026#39;wsrep_incoming_addresses\u0026#39;, \u0026#39;wsrep_evs_state\u0026#39;, \u0026#39;wsrep_flow_control_paused\u0026#39;, \u0026#39;wsrep_local_recv_queue_avg\u0026#39;, \u0026#39;wsrep_local_send_queue_avg\u0026#39;, \u0026#39;wsrep_cert_deps_distance\u0026#39; ); How to interpret the results # Variable Healthy value Meaning wsrep_cluster_size 3 All nodes are in the cluster wsrep_cluster_status Primary The cluster is operational and has quorum wsrep_connected ON This node is connected to the cluster wsrep_ready ON This node accepts queries wsrep_local_state_comment Synced This node is synchronised wsrep_flow_control_paused 0.0 No flow control pauses wsrep_local_recv_queue_avg \u0026lt; 0.5 Receive queue is under control wsrep_local_send_queue_avg \u0026lt; 0.5 Send queue is under control Warning signs #wsrep_cluster_status = Non-Primary: the node has lost quorum. It\u0026rsquo;s isolated. It won\u0026rsquo;t accept writes (and shouldn\u0026rsquo;t). This happens when a node loses connection with the majority of the cluster.\nwsrep_flow_control_paused \u0026gt; 0.0: flow control activated. It means a node is too slow applying transactions and is asking the others to slow down. A value close to 1.0 means the cluster is essentially stalled, waiting for the slowest node.\nwsrep_local_recv_queue_avg \u0026gt; 1.0: incoming transactions are piling up. 
Could be a disk I/O problem, CPU, or an undersized node.\nMonitoring script #I also delivered a script for their monitoring system (Zabbix, in their case):\n#!/bin/bash # galera_health_check.sh — run on every node MYSQL=\u0026#34;mysql -u monitor -p\u0026#39;monitor_pwd\u0026#39; -Bse\u0026#34; CLUSTER_SIZE=$($MYSQL \u0026#34;SHOW STATUS LIKE \u0026#39;wsrep_cluster_size\u0026#39;\u0026#34; | awk \u0026#39;{print $2}\u0026#39;) CLUSTER_STATUS=$($MYSQL \u0026#34;SHOW STATUS LIKE \u0026#39;wsrep_cluster_status\u0026#39;\u0026#34; | awk \u0026#39;{print $2}\u0026#39;) NODE_STATE=$($MYSQL \u0026#34;SHOW STATUS LIKE \u0026#39;wsrep_local_state_comment\u0026#39;\u0026#34; | awk \u0026#39;{print $2}\u0026#39;) FLOW_CONTROL=$($MYSQL \u0026#34;SHOW STATUS LIKE \u0026#39;wsrep_flow_control_paused\u0026#39;\u0026#34; | awk \u0026#39;{print $2}\u0026#39;) if [ \u0026#34;$CLUSTER_SIZE\u0026#34; -lt 3 ] || [ \u0026#34;$CLUSTER_STATUS\u0026#34; != \u0026#34;Primary\u0026#34; ] || [ \u0026#34;$NODE_STATE\u0026#34; != \u0026#34;Synced\u0026#34; ]; then echo \u0026#34;CRITICAL: Galera cluster degraded\u0026#34; echo \u0026#34; Size: $CLUSTER_SIZE | Status: $CLUSTER_STATUS | State: $NODE_STATE | FC: $FLOW_CONTROL\u0026#34; exit 2 fi echo \u0026#34;OK: Galera cluster healthy (3 nodes, Primary, Synced)\u0026#34; exit 0 The split-brain problem: why three nodes and not two #When I presented the solution to the client, the first question was: \u0026ldquo;Do we really need three servers? Wouldn\u0026rsquo;t two be enough?\u0026rdquo;\nNo. And it\u0026rsquo;s not a cost issue — it\u0026rsquo;s a matter of mathematics.\nGalera uses a consensus algorithm based on quorum. With three nodes, the quorum is 2: if one node fails, the other two recognise they are the majority and continue operating. 
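The majority rule can be put in one line: a partition stays Primary only if it still sees floor(n/2) + 1 of the configured nodes. A minimal illustration of the arithmetic (not a Galera API, just the counting argument):

```shell
#!/bin/bash
# quorum prints the smallest strict majority for a cluster of n nodes.
quorum() { echo $(( $1 / 2 + 1 )); }

for n in 2 3 5; do
  echo "cluster of $n nodes: quorum = $(quorum "$n")"
done
# A 3-node cluster survives one failure (2 remaining >= quorum of 2);
# a 2-node cluster survives none (1 remaining < quorum of 2).
```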
With two nodes, the quorum is 2: if one node fails, the remaining one doesn\u0026rsquo;t have quorum and blocks to prevent split-brain.\nThe parameter pc.ignore_quorum exists to force a node to operate without quorum, but that\u0026rsquo;s like disabling the fire alarm because it rings too often.\nThree nodes is the minimum for a production Galera cluster. The third node isn\u0026rsquo;t a luxury — it\u0026rsquo;s what allows the cluster to keep running when things go wrong.\nWhen a node goes down and comes back #One of the first things I did after going to production was simulate a failure — with the client watching.\nI shut down Node 3. The application kept running without interruption on nodes 1 and 2. No errors, no timeouts. Two hundred users who noticed nothing.\nThen I restarted Node 3. What happened:\nThe node started and contacted the others via wsrep_cluster_address The transaction gap was small, so it received an IST (Incremental State Transfer) — only the missing transactions In less than a minute it was Synced again If the node had stayed down longer and the gcache had been exceeded, it would have received a full SST — the entire dataset. That\u0026rsquo;s why the gcache.size parameter matters:\nwsrep_provider_options=\u0026#34;gcache.size=512M\u0026#34; A larger gcache means the cluster can tolerate longer node downtime without requiring a full SST. In the client\u0026rsquo;s case, with about 80-100 MB of transactions per day, a 512 MB gcache covered nearly a week of absence.\nThe client watched Node 3 come back in sync and said: \u0026ldquo;So the next time we need to do maintenance on a server, we don\u0026rsquo;t have to stop everything?\u0026rdquo; Exactly. 
That was the point.

Production readiness checklist #

Before declaring the cluster ready to the client, I verified every point:

Same MariaDB version on all nodes
wsrep_cluster_size = 3
wsrep_cluster_status = Primary on all nodes
wsrep_local_state_comment = Synced on all nodes
Write test on Node 1, read verification on Node 2 and 3
Shutdown test on one node: the cluster keeps running
Rejoin test: the node returns to Synced without a full SST
SST user configured and working
Firewall verified on all ports (3306, 4567, 4568, 4444)
Monitoring active on wsrep_cluster_status and wsrep_flow_control_paused
Backup configured (on ONE node, not all three)
Application reconfigured to point to the load balancer or VIP

Six months later #

I heard from the client six months after going to production. In the meantime they had two scheduled reboots for system updates and one unexpected disk failure on one of the nodes. In all three cases, the application never stopped working. Zero unplanned downtime.

What struck me most was his comment: "We used to live with the anxiety of the database going down. Now we don't think about it anymore."

That's the real value of a well-configured Galera cluster. It's not the technology itself — it's the peace of mind it brings. The certainty that a single failure no longer stops the business.

The technical part is the easiest. What makes the difference is understanding why each parameter is set a certain way, what happens when things go wrong, and how to diagnose problems before they become emergencies. A cluster that works in a demo and one that holds in production: the distance between the two is all in the details I've described here.

Glossary #

Quorum — Majority-based consensus mechanism. With 3 nodes the quorum is 2: if one fails, the other two continue operating.
This is what prevents split-brain.\nSST — State Snapshot Transfer: mechanism by which a node joining the cluster receives a complete copy of the entire dataset from a donor node. The recommended method is mariabackup.\nIST — Incremental State Transfer: transfer of only the missing transactions to a node rejoining the cluster after a brief absence. Much faster than a full SST.\nWSREP — Write Set Replication: synchronous replication API and protocol used by Galera Cluster. Each transaction is replicated to all nodes before commit through a certification process.\nSplit-brain — Critical condition where two parts of the cluster operate independently accepting divergent writes. Quorum prevents it: only the partition with the majority of nodes can continue operating.\n","date":"17 February 2026","permalink":"https://ivanluminaria.com/en/posts/mysql/galera-cluster-3-nodi/","section":"Database Strategy","summary":"\u003cp\u003eThe ticket was laconic, as it often is when the problem is serious: \u0026ldquo;The database went down again. The application is stopped. Third time in two months.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eThe client had a MariaDB on a single Linux server — a business management application used by about two hundred internal users, with load spikes during end-of-month accounting closures. 
Every time the server had a problem — a disk slowing down, a system update requiring a reboot, a process consuming all the RAM — the database crashed and with it the entire business operations.\u003c/p\u003e","title":"Galera Cluster with 3 nodes: how I solved a MySQL availability problem"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/high-availability/","section":"Tags","summary":"","title":"High-Availability"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/wsrep/","section":"Tags","summary":"","title":"Wsrep"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/ash/","section":"Tags","summary":"","title":"Ash"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/awr/","section":"Tags","summary":"","title":"Awr"},{"content":"Friday, 6:40 PM. I already had my jacket on, ready to leave. The phone buzzes. It\u0026rsquo;s the project manager.\n\u0026ldquo;Ivan, we have a problem. The system is crawling. The go-live is tomorrow morning.\u0026rdquo;\nIt\u0026rsquo;s not the first time I\u0026rsquo;ve received a call like that. But the tone was different. This wasn\u0026rsquo;t the usual vague complaint about slowness. This was panic.\nI reconnect to the VPN, open a session on the client\u0026rsquo;s Oracle 19c database. First thing I do is a quick check:\nSELECT metric_name, value FROM v$sysmetric WHERE metric_name IN (\u0026#39;Database CPU Time Ratio\u0026#39;, \u0026#39;Database Wait Time Ratio\u0026#39;, \u0026#39;Average Active Sessions\u0026#39;); CPU Time Ratio: 12%. Under normal conditions it was above 80%.\nAverage Active Sessions: 47. On a server with 16 cores.\nForty-seven active sessions. The database was drowning.\n🔥 The symptoms #The development team had completed the last application code deploy that afternoon. Everything seemed to work on the test environment. 
But when they launched the pre-go-live verification batch — the one that simulates production load — response times exploded.

Queries that normally ran in 2-3 seconds were taking 45. Batches that finished in 20 minutes were still running after an hour. The dominant wait events were db file sequential read and db file scattered read — unmistakable signs of massive physical I/O.

Something was reading enormous amounts of data from disk. Something that wasn't there before.

📊 AWR: the big picture #

AWR — Automatic Workload Repository — is the most powerful diagnostic tool Oracle provides. Every hour, Oracle takes a snapshot of performance statistics and stores it in the internal repository. By comparing two snapshots, you get a report that tells you exactly what happened during that period.

I generated a manual snapshot to capture the current situation:

EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;

Then I looked for available snapshots:

SELECT snap_id, begin_interval_time, end_interval_time
FROM dba_hist_snapshot
WHERE begin_interval_time > SYSDATE - 1/6
ORDER BY snap_id DESC;

I had a snapshot from 6:00 PM (before the visible problem) and the one I had just created at 6:45 PM. I generated the AWR report:

SELECT output FROM TABLE(DBMS_WORKLOAD_REPOSITORY.awr_report_text(
    l_dbid     => (SELECT dbid FROM v$database),
    l_inst_num => 1,
    l_bid      => 4523,  -- begin snapshot
    l_eid      => 4524   -- end snapshot
));

What the report said #

The Top 5 Timed Foreground Events section was telling:

Event                   | Waits     | Time (s) | % DB time
db file scattered read  | 1,247,832 | 3,847    | 58.2%
db file sequential read | 423,109   | 1,205    | 18.2%
CPU + Wait for CPU      | —         | 892      | 13.5%
log file sync           | 12,445    | 287      | 4.3%
direct path read        | 8,221     | 198      | 3.0%

db file scattered read at 58%. Those are full table scans.
Something was reading entire tables, block by block, without using indexes.

The SQL ordered by Elapsed Time section showed a single SQL_ID consuming 71% of total database time: g4f2h8k1nw3z9.

Now I knew what to look for.

🔍 ASH: the microscope #

AWR had given me the big picture. But I needed to understand when that SQL started, who was running it, and which program had launched it.

ASH — Active Session History — records the state of every active session once per second. It is the DBA's microscope: where AWR shows you averages over an hour, ASH shows you what was happening second by second.

SELECT sample_time, session_id, sql_id, sql_plan_hash_value,
       event, program, module
FROM v$active_session_history
WHERE sql_id = 'g4f2h8k1nw3z9'
  AND sample_time > SYSDATE - 1/24
ORDER BY sample_time DESC;

The results were clear:

Program: JDBC Thin Client — the Java batch application
Module: BatchVerificaProduzione
Event: db file scattered read in 92% of samples
First occurrence: 5:12 PM — right after the afternoon deploy
SQL_PLAN_HASH_VALUE: 2891047563

The execution plan had changed. Before the deploy, that query used a different plan.

🧩 The execution plan #

I retrieved the current plan:

SELECT * FROM TABLE(DBMS_XPLAN.display_awr(
    sql_id          => 'g4f2h8k1nw3z9',
    plan_hash_value => 2891047563
));

The result made the problem immediately obvious:

---------------------------------------------------------------------------
| Id | Operation           | Name           | Rows  | Cost  |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |                |       | 48721 |
|  1 |  HASH JOIN          |                |  2.1M | 48721 |
|  2 |   TABLE ACCESS FULL | MOVIMENTI_TEMP |  2.1M | 41893 |
|  3 |   INDEX RANGE SCAN  | IDX_CLIENTI_PK |     1 |     2 |
---------------------------------------------------------------------------

TABLE ACCESS FULL on MOVIMENTI_TEMP. A temporary table with 2.1 million rows, read in full every time. No index.
No effective filter.

I checked what existed before the deploy by looking at the previous plan in AWR:

SELECT plan_hash_value, timestamp
FROM dba_hist_sql_plan
WHERE sql_id = 'g4f2h8k1nw3z9'
ORDER BY timestamp;

The previous plan (hash 1384726091) used an INDEX RANGE SCAN on an index that — as it turned out — had been dropped during the deploy. The migration script included a DROP TABLE MOVIMENTI_TEMP followed by a recreate, but without recreating the index.

⚡ The fix #

Ten minutes. From the moment I connected to the moment I identified the cause. Not because of skill — because of the tools.

The fix was straightforward:

CREATE INDEX idx_movimenti_temp_cliente
ON movimenti_temp (id_cliente, data_movimento)
TABLESPACE idx_data;

After creating the index, I forced a re-parse by purging the cursor from the shared pool. Note that DBMS_SHARED_POOL.purge identifies a cursor by its address and hash_value, not by sql_id:

SELECT address, hash_value FROM v$sqlarea WHERE sql_id = 'g4f2h8k1nw3z9';
EXEC DBMS_SHARED_POOL.purge('&address,&hash_value', 'C');

I asked the team to relaunch the batch. Execution time: 18 minutes. Identical to previous tests.

The Saturday morning go-live proceeded as planned.

📋 AWR vs ASH: when to use which #

After that episode I formalised a rule I always follow:

Characteristic   | AWR                               | ASH
Granularity      | Hourly snapshots (configurable)   | Sample every second
Historical depth | Up to 30 days (default 8)         | 1 hour in memory, then in AWR
Primary use case | Trend analysis, period comparison | Pinpoint diagnosis, SQL isolation
Primary view     | DBA_HIST_*                        | V$ACTIVE_SESSION_HISTORY
Historical view  | —                                 | DBA_HIST_ACTIVE_SESS_HISTORY
Licence required | Diagnostic Pack                   | Diagnostic Pack
Typical output   | HTML/text report                  | Ad hoc queries

The rule of thumb: AWR to understand what changed, ASH to understand why.

AWR tells you: "Between 5:00 PM and 6:00 PM, 58% of database time was spent on full table scans." ASH tells you: "At 5:12:34 PM, session 847 was executing query g4f2h8k1nw3z9 with a full table scan on MOVIMENTI_TEMP, launched by the program BatchVerificaProduzione."

They are complementary.
Using only one is like diagnosing a problem by looking only at the CT scan or only at the blood tests.

🛡️ Queries every DBA should have ready #

Over the years I've built a set of diagnostic queries that I always keep at hand. I share them because in an emergency there is no time to write them from scratch.

Top SQL by execution time (last hour) #

SELECT sql_id,
       COUNT(*) AS samples,
       ROUND(COUNT(*) / 60, 1) AS est_minutes,  -- each ASH sample is roughly 1 second of active time
       MAX(event) AS top_event,
       MAX(program) AS program
FROM v$active_session_history
WHERE sample_time > SYSDATE - 1/24
  AND sql_id IS NOT NULL
GROUP BY sql_id
ORDER BY samples DESC
FETCH FIRST 10 ROWS ONLY;

Wait event distribution for a specific SQL #

SELECT event,
       COUNT(*) AS samples,
       ROUND(COUNT(*) * 100 / SUM(COUNT(*)) OVER (), 1) AS pct
FROM v$active_session_history
WHERE sql_id = '&sql_id'
  AND sample_time > SYSDATE - 1/24
GROUP BY event
ORDER BY samples DESC;

Execution plan comparison over time #

DBA_HIST_SQLSTAT has no timestamp column of its own, so the time range comes from joining DBA_HIST_SNAPSHOT:

SELECT s.plan_hash_value,
       MIN(sn.begin_interval_time) AS first_seen,
       MAX(sn.end_interval_time)   AS last_seen,
       SUM(s.executions_delta)     AS executions_in_awr
FROM dba_hist_sqlstat s
JOIN dba_hist_snapshot sn
  ON sn.snap_id = s.snap_id
 AND sn.dbid = s.dbid
 AND sn.instance_number = s.instance_number
WHERE s.sql_id = '&sql_id'
GROUP BY s.plan_hash_value
ORDER BY first_seen;

🎯 What I learned that evening #

Three lessons I carry with me.

First: a deploy is not just code. It is also structure. When you release to production, you must verify that indexes, constraints, statistics and grants are consistent with what was there before. A script that does DROP TABLE and CREATE TABLE without recreating the indexes is a time bomb.

Second: AWR and ASH are not tools for senior DBAs. They are front-line tools, like a defibrillator. You need to know how to use them before you need them, not during the emergency.

Third: ten minutes of correct diagnosis are worth more than three hours of blind attempts. When the system is on its knees, the temptation is to restart, kill sessions, throw more resources at it.
But without knowing what is happening, you are shooting in the dark.\nThat evening I left the office at 7:20 PM. Forty minutes after the phone call. The next day the go-live went ahead without a hitch, and on Monday the system was running smoothly.\nI\u0026rsquo;m not a hero. I just used the right tools.\nGlossary #AWR — Automatic Workload Repository. A built-in Oracle component that collects performance statistics through periodic snapshots and generates comparative diagnostic reports.\nASH — Active Session History. An Oracle component that samples the state of every active session once per second, storing it in memory and then in AWR. It is the DBA\u0026rsquo;s microscope for pinpoint diagnosis.\nFull Table Scan — A read operation where Oracle reads every block of a table without using indexes. In wait events it shows up as db file scattered read.\nWait Event — A diagnostic event recorded by Oracle whenever a session cannot proceed because it is waiting for a resource (I/O, lock, CPU, network). Wait event analysis is the foundation of Oracle\u0026rsquo;s diagnostic methodology.\nSnapshot — A point-in-time capture of performance statistics taken periodically by AWR (every 60 minutes by default). Comparing two snapshots generates the AWR report.\n","date":"10 February 2026","permalink":"https://ivanluminaria.com/en/posts/oracle/oracle-awr-ash/","section":"Database Strategy","summary":"\u003cp\u003eFriday, 6:40 PM. I already had my jacket on, ready to leave. The phone buzzes. It\u0026rsquo;s the project manager.\u003c/p\u003e\n\u003cp\u003e\u0026ldquo;Ivan, we have a problem. The system is crawling. The go-live is tomorrow morning.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eIt\u0026rsquo;s not the first time I\u0026rsquo;ve received a call like that. But the tone was different. This wasn\u0026rsquo;t the usual vague complaint about slowness. 
This was panic.\u003c/p\u003e\n\u003cp\u003eI reconnect to the VPN, open a session on the client\u0026rsquo;s Oracle 19c database. First thing I do is a quick check:\u003c/p\u003e","title":"AWR, ASH and the 10 minutes that saved a go-live"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/diagnostic/","section":"Tags","summary":"","title":"Diagnostic"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/go-live/","section":"Tags","summary":"","title":"Go-Live"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/grant/","section":"Tags","summary":"","title":"Grant"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/privileges/","section":"Tags","summary":"","title":"Privileges"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/revoke/","section":"Tags","summary":"","title":"Revoke"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/roles/","section":"Tags","summary":"","title":"Roles"},{"content":"The first time I seriously worked with PostgreSQL I was coming from years of other databases. I looked for the CREATE USER command. I found it. Then I saw CREATE ROLE. Then ALTER USER. Then ALTER ROLE.\nFor a few minutes I thought: \u0026ldquo;Alright, someone here enjoys confusing people.\u0026rdquo;\nActually, no. PostgreSQL is far more consistent than it appears. It is just consistent in its own way.\nIn PostgreSQL there are no users. There are roles. 
#

The key is this: in PostgreSQL everything is a ROLE.

A ROLE can:

have the right to login
not have the right to login
own objects
inherit privileges from other roles
be used as a container of privileges

What in other databases you call a "user" in PostgreSQL is simply a role with the LOGIN attribute.

In fact:

CREATE USER mario;

is nothing more than a shortcut for:

CREATE ROLE mario WITH LOGIN;

Same for ALTER USER: it is only an alias of ALTER ROLE.

Why do only CREATE ROLE and ALTER ROLE really exist? Because PostgreSQL does not conceptually distinguish between user and role. It is the same object with different attributes. Minimalist. Elegant. Consistent.

If a role has LOGIN, it behaves like a user. If it does not have LOGIN, it is a container of privileges.

When you truly understand this, the way you design security changes.

The correct mental model #

Today I reason like this:

I create "functional" roles that represent sets of privileges
I assign those roles to real users
I avoid granting permissions directly to users

Why? Because users change. Roles do not.

If tomorrow a new colleague joins, I do not rewrite half the database grants. I assign the correct role and that is it.

Clean architecture. No magic. No chaos.

A real story (without embarrassing names) #

Some time ago I was asked to create a read-only user for a monitoring system.

Seemingly simple request: "It must read some tables.
No writing."

The classic "it's just read-only".

The trap is always the same: if you only run a GRANT SELECT on existing tables, it works today. Three months later someone creates a new table and monitoring starts throwing errors. And guess who gets called.

The correct solution requires attention at four levels:

Permission to connect to the database
Permission to use the schema (USAGE)
SELECT permissions on existing tables and sequences
Default privileges for future objects

If you skip a piece, sooner or later you pay the price.

Example: creating a proper read-only user #

Suppose we want to create a read-only user on two schemas.

First I create the role with login:

CREATE ROLE srv_monitoring WITH LOGIN PASSWORD 'SecurePassword123#';

I lock it down:

ALTER ROLE srv_monitoring NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT;

Allow connection to the database:

GRANT CONNECT ON DATABASE mydb TO srv_monitoring;

Allow usage on schemas:

GRANT USAGE ON SCHEMA schema1 TO srv_monitoring;
GRANT USAGE ON SCHEMA schema2 TO srv_monitoring;

Grant read permissions on existing objects:

GRANT SELECT ON ALL TABLES IN SCHEMA schema1 TO srv_monitoring;
GRANT SELECT ON ALL TABLES IN SCHEMA schema2 TO srv_monitoring;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA schema1 TO srv_monitoring;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA schema2 TO srv_monitoring;

And now the part many people forget:

ALTER DEFAULT PRIVILEGES IN SCHEMA schema1 GRANT SELECT ON TABLES TO srv_monitoring;
ALTER DEFAULT PRIVILEGES IN SCHEMA schema2 GRANT SELECT ON TABLES TO srv_monitoring;

This way future tables will also be readable.

Important note: ALTER DEFAULT PRIVILEGES applies to the role that creates the objects.
If multiple owners create tables in the same schemas, the configuration must be replicated for each of them.\nWhy this model is powerful #The fact that everything is a ROLE allows you to build clean hierarchies.\nAdvanced example:\nCREATE ROLE role_readonly; GRANT SELECT ON ALL TABLES IN SCHEMA schema1 TO role_readonly; CREATE ROLE srv_monitoring WITH LOGIN PASSWORD \u0026#39;...\u0026#39;; GRANT role_readonly TO srv_monitoring; Now I can assign role_readonly to ten different users without duplicating grants.\nThis is design. Not just syntax.\nConclusion #PostgreSQL does not complicate the concept of user. It simplifies it.\nThere is only one type of object: the ROLE. It is up to us to use it well.\nIf you treat it as just a \u0026ldquo;user with a password\u0026rdquo;, it works.\nIf you use it as an architectural building block, it becomes a powerful tool to design clean, scalable and maintainable security.\nThe difference is not in the commands.\nIt is in the mental model you use when applying them.\nGlossary #ROLE — PostgreSQL\u0026rsquo;s fundamental entity that unifies the concept of user and permission group: a ROLE with LOGIN is a user, without LOGIN it is a privilege container.\nDEFAULT PRIVILEGES — PostgreSQL mechanism that automatically defines privileges to assign to all future objects created in a schema, avoiding the need to repeat GRANTs manually.\nSchema — Logical namespace within a database that groups tables, views, functions and other objects, enabling organization and permission separation.\nGRANT — SQL command to assign specific privileges to a user or role on databases, tables, or columns.\nLeast Privilege — Security principle that prescribes assigning to each user only the permissions strictly necessary to perform their function.\n","date":"10 February 2026","permalink":"https://ivanluminaria.com/en/posts/postgresql/postgresql_roles_and_users/","section":"Database Strategy","summary":"\u003cp\u003eThe first time I seriously worked with 
PostgreSQL I was coming from\nyears of other databases. I looked for the \u003ccode\u003eCREATE USER\u003c/code\u003e command. I found it.\nThen I saw \u003ccode\u003eCREATE ROLE\u003c/code\u003e. Then \u003ccode\u003eALTER USER\u003c/code\u003e. Then \u003ccode\u003eALTER ROLE\u003c/code\u003e.\u003cbr\u003e\nFor a few minutes I thought: \u0026ldquo;Alright, someone here enjoys confusing\npeople.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eActually, no. PostgreSQL is far more consistent than it appears.\nIt is just consistent in its own way.\u003c/p\u003e\n\u003ch2 id=\"in-postgresql-there-are-no-users-there-are-roles\" class=\"relative group\"\u003eIn PostgreSQL there are no users. There are roles. \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#in-postgresql-there-are-no-users-there-are-roles\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe key is this: \u003cstrong\u003ein PostgreSQL everything is a ROLE\u003c/strong\u003e.\u003c/p\u003e","title":"Roles and Users in PostgreSQL: Why Everything Is (Only) a ROLE"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/security/","section":"Tags","summary":"","title":"Security"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/bug-fixing/","section":"Tags","summary":"","title":"Bug-Fixing"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/github/","section":"Tags","summary":"","title":"Github"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/software-evolution/","section":"Tags","summary":"","title":"Software-Evolution"},{"content":"A client calls me. Tense voice, measured words.\n\u0026ldquo;Ivan, we have a problem. Actually, we have the problem.\u0026rdquo;\nI know that tone. 
It\u0026rsquo;s the tone of someone who has already tried to fix things internally, failed, and is now looking for someone to tell them the truth without beating around the bush.\nThe problem is a management software — not a website, not an app — a critical system running important business processes. It\u0026rsquo;s a few years old. It grew fast, as always happens when business runs faster than architecture. And now everything has piled up: open bugs that nobody closes, change requests that nobody plans, developers working on different versions of the code without knowing what the others are doing.\nThe classic scenario that, on paper, \u0026ldquo;works.\u0026rdquo; But inside, it\u0026rsquo;s a minefield.\n🧠 The first meeting: understanding what\u0026rsquo;s really broken #When I enter a project like this, I don\u0026rsquo;t look at the code first.\nI look at the people. I look at how they communicate. I look at where information gets lost.\nThe team was made up of four solid developers. Serious. Competent.\nBut they worked like this:\nthe code lived on a shared network folder changes were communicated via email or on an Excel spreadsheet bugs were reported verbally, in chat, via tickets — with no consistent method nobody knew for certain which was the \u0026ldquo;good\u0026rdquo; version of the software And you know what happens in these situations?\nEveryone is right from their own point of view. But the project, as a whole, is out of control.\nThe problem isn\u0026rsquo;t technical. It\u0026rsquo;s organizational.\nAnd that changes everything.\n📌 The proposal: GitHub as the project\u0026rsquo;s backbone #The first thing I put on the table was clear, direct, no frills:\nWe adopt GitHub. All code goes through there. No exceptions.\nIt\u0026rsquo;s not about trends. 
It\u0026rsquo;s not because \u0026ldquo;everyone does it.\u0026rdquo;\nIt\u0026rsquo;s because GitHub solves, with concrete tools, problems that no Excel spreadsheet can ever manage:\nReal versioning : every change is tracked, commented, reversible Branches and Pull Requests : every developer works on their own copy, then proposes changes to the team — they don\u0026rsquo;t overwrite each other\u0026rsquo;s work Integrated issue tracker : bugs and feature requests live in the same place as the code Complete history: who did what, when, why I saw the senior developer\u0026rsquo;s face. A mix of curiosity and skepticism.\n\u0026ldquo;But we\u0026rsquo;ve always done it this way.\u0026rdquo;\nI answered calmly: \u0026ldquo;I know. And the result is the reason I\u0026rsquo;m here.\u0026rdquo;\nI didn\u0026rsquo;t say it to provoke. I said it because it\u0026rsquo;s the truth.\nAnd the truth, when said the right way, doesn\u0026rsquo;t offend. It liberates.\n🔬 The second step: AI as an accelerator, not a replacement #Once the workflow on GitHub was defined — branches, reviews, controlled merges — I made my second proposal.\n\u0026ldquo;Let\u0026rsquo;s integrate artificial intelligence into the bug resolution process.\u0026rdquo;\nSilence.\nI understand the reaction. When you say \u0026ldquo;AI\u0026rdquo; in a room full of developers, half think of ChatGPT generating random code, the other half think you\u0026rsquo;re telling them their job is no longer needed.\nNeither of those things.\nWhat I proposed is very different:\nWhen a developer picks up a bug, before writing a single line of code, they use AI to analyze the context The AI reads the code involved, the logs, the problem description It proposes hypotheses. 
Not definitive solutions — reasoned hypotheses The developer evaluates, verifies, and then implements AI doesn\u0026rsquo;t replace the programmer.\nAI saves them the first two hours of analysis — those hours spent reading code written by someone else, trying to figure out what on earth is going on.\nAnd those two hours, multiplied by every bug, every developer, every week, become a number that changes the project\u0026rsquo;s economics.\n📊 The numbers I put on the table #I didn\u0026rsquo;t sell dreams. I presented conservative estimates.\nThe team handled an average of 15-20 bugs per week.\nAverage resolution time was about 6 hours per bug (between analysis, fix, testing, deploy).\nWith the introduction of GitHub + AI into the workflow, my estimate was:\nMetric Before After (estimate) Average bug analysis time ~2.5 hours ~15/20 minutes Total resolution time ~6 hours ~30 minutes Bugs resolved per week 15-20 180-240 Code conflicts frequent rare Project status visibility none complete A reduction of over 90% in total resolution time.\nA 12x increase in the team\u0026rsquo;s capacity to close tickets.\nWithout hiring anyone. Without changing the people. By changing the method.\n🛠️ How it works in practice #The workflow I designed is simple. Intentionally simple.\n1. The bug arrives as an Issue on GitHub\nClear title, description, priority label. No more emails, no more chat.\n2. The developer creates a dedicated branch\nfix/issue-234-vat-calculation-error — the name says it all.\n3. Before touching the code, they query the AI\nThey pass it the relevant code, the error, the context. The AI returns a structured analysis: where the problem might be, which files are involved, which tests to verify.\n4. The developer implements the fix\nWith a huge advantage: they already know where to look.\n5. Pull Request with review A colleague reviews the code. Not for formality — for quality.\n6. Merge into the main branch\nOnly after approval. 
The \u0026ldquo;good\u0026rdquo; code stays good.\n7. The Issue closes automatically\nComplete traceability. From problem to solution, everything documented.\n📈 What changed after three weeks #The first days were the hardest. Always are.\nNew tools, new habits, the temptation to go back to \u0026ldquo;how we used to do things.\u0026rdquo;\nBut after three weeks, something happened.\nThe senior developer — the one who had looked at me with skepticism — wrote to me:\n\u0026ldquo;Ivan, yesterday I fixed a bug that last year had blocked me for two days. With AI, it took me forty minutes. Not because the AI wrote the code. But because it showed me right away where the problem was.\u0026rdquo;\nThere it is. That\u0026rsquo;s the point.\nAI doesn\u0026rsquo;t write better code than an experienced developer.\nAI accelerates the path between the problem and understanding the problem.\nAnd understanding is always the most expensive step.\n🎯 The lesson I take home #Every time I enter a struggling project, I find the same pattern:\nCompetent people Inadequate tools Absent or informal processes Growing frustration The solution is never \u0026ldquo;work harder.\u0026rdquo;\nThe solution is work differently.\nGitHub isn\u0026rsquo;t a tool for developers. It\u0026rsquo;s a tool for teams.\nAI isn\u0026rsquo;t a toy. 
It\u0026rsquo;s a competence multiplier.\nBut neither works if there isn\u0026rsquo;t someone looking at the project from above, understanding where the hours are being lost, and having the courage to say: \u0026ldquo;Let\u0026rsquo;s change.\u0026rdquo;\n💬 To those who recognize themselves in this story #If you\u0026rsquo;re managing a software project and you see yourself in what I\u0026rsquo;ve described — code scattered everywhere, bugs that keep coming back, a team that works hard but closes little — know that it\u0026rsquo;s not the people\u0026rsquo;s fault.\nIt\u0026rsquo;s the fault of the system they work in.\nAnd the system can be changed.\nIt must be changed.\nYou don\u0026rsquo;t need revolutions. You need precise choices, implemented with method.\nA shared repository. A clear workflow. An intelligent assistant that accelerates analysis.\nThree things. Three decisions.\nThat transform chaos into control.\nGlossary #Pull Request — Formal request to incorporate changes from a branch into the main branch, with mandatory code review. The mechanism that ensures \u0026ldquo;good\u0026rdquo; code stays good.\nVersion Control — System that tracks every code change, maintaining complete history. Git is the standard; GitHub adds collaboration on top of Git.\nIssue Tracker — Integrated tracking system for bugs and feature requests. On GitHub, issues live in the same place as the code, with traceability from problem to solution.\nCode Review — Code review by a colleague before merge. Catches bugs, improves quality and spreads codebase knowledge across the team.\nBranch — Independent development line that allows working on isolated changes without affecting the main code until approved merge.\n","date":"3 February 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/ai-github-project-management/","section":"Database Strategy","summary":"\u003cp\u003eA client calls me. 
Tense voice, measured words.\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;Ivan, we have a problem. Actually, we have \u003cstrong\u003ethe\u003c/strong\u003e problem.\u0026rdquo;\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eI know that tone. It\u0026rsquo;s the tone of someone who has already tried to fix things internally, failed, and is now looking for someone to tell them the truth without beating around the bush.\u003c/p\u003e\n\u003cp\u003eThe problem is a piece of management software — not a website, not an app — a critical system running important business processes. It\u0026rsquo;s a few years old. It grew fast, as always happens when business runs faster than architecture. And now everything has piled up: open bugs that nobody closes, change requests that nobody plans, developers working on different versions of the code without knowing what the others are doing.\u003c/p\u003e","title":"When chaos becomes method: AI and GitHub to manage a project nobody wanted to touch"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/workflow/","section":"Tags","summary":"","title":"Workflow"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/agile/","section":"Tags","summary":"","title":"Agile"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/audit/","section":"Tags","summary":"","title":"Audit"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/meeting/","section":"Tags","summary":"","title":"Meeting"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/scrum/","section":"Tags","summary":"","title":"Scrum"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/standup/","section":"Tags","summary":"","title":"Standup"},{"content":"First Monday of the project. New team, new methodology, new hopes. The PM proposes a daily standup. Everyone nods. \u0026ldquo;Fifteen minutes, standing up, three questions. 
Simple.\u0026rdquo;\nThe first week works. At 9:15 it starts, by 9:28 everyone is back at their desk. Each person speaks for two minutes, blockers are flagged, people move on. Pure efficiency.\nThe second week someone raises a hand mid-round: \u0026ldquo;Can I quickly explain the problem I\u0026rsquo;m having with the integration?\u0026rdquo; Five minutes of technical discussion between two people. The other six stand there listening to something that doesn\u0026rsquo;t concern them.\nThe third week the standup lasts thirty-five minutes. Someone brings a laptop. Someone else sits down. The three-question round has become a status meeting with open discussions, improvised demos and architectural debates.\nBy the fourth week the team starts skipping the standup. \u0026ldquo;It lasts half an hour anyway, I don\u0026rsquo;t have time.\u0026rdquo;\nI\u0026rsquo;ve seen this sequence at least ten times in my career. It\u0026rsquo;s not bad luck. It\u0026rsquo;s a pattern.\n⏱️ Why the 15-minute constraint is non-negotiable #A standup has one purpose: synchronise the team. It is not an analysis meeting. It is not a problem-solving session. It is not a design workshop. It is a quick alignment checkpoint.\nAnd the time constraint is what makes it so.\nWhen a standup lasts 15 minutes, specific things happen:\nPeople prepare before the meeting, because they know they have two minutes Problems are flagged, not solved. Resolution happens afterwards, between the people involved The team maintains the perception that the standup is useful and respectful of their time Nobody walks in thinking \u0026ldquo;here we go, another half hour wasted\u0026rdquo; When the standup runs past 20 minutes, the mechanism breaks:\nDuration Effect on the team 10-15 min High focus, active participation, positive perception 15-20 min Acceptable, but some people start drifting 20-30 min People not involved in the long threads mentally check out 30-45 min The team sees the standup as a waste of time. 
Absences begin 45+ min The standup is dead. It has become a status meeting dressed up as an agile practice The most dangerous thing is not the overrun itself. It\u0026rsquo;s that it happens gradually. Three extra minutes today, five tomorrow. Nobody notices until it\u0026rsquo;s too late.\n❓ The three questions — and nothing else #The classic standup is built on three questions:\nWhat did I do yesterday? What will I do today? Is anything blocking me? Simple. But simplicity is treacherous, because the temptation to expand is constant.\n\u0026ldquo;What I did yesterday\u0026rdquo; doesn\u0026rsquo;t mean retelling your day. It means saying: \u0026ldquo;I finished the lookup table migration\u0026rdquo; or \u0026ldquo;I worked on bug #247, haven\u0026rsquo;t resolved it yet.\u0026rdquo; Ten seconds, not three minutes.\n\u0026ldquo;What I\u0026rsquo;ll do today\u0026rdquo; is not a detailed plan. It\u0026rsquo;s a statement of intent: \u0026ldquo;Today I\u0026rsquo;ll finish bug #247 and start integration testing.\u0026rdquo;\n\u0026ldquo;Is anything blocking me\u0026rdquo; is the most important question. Because this is where dependencies surface, bottlenecks appear, problems that one person alone cannot solve. But — and this is crucial — the blocker is flagged, not resolved live.\nWhen someone says \u0026ldquo;I\u0026rsquo;m blocked because I don\u0026rsquo;t have access to the staging environment\u0026rdquo;, the correct response is not a fifteen-minute discussion about who should grant access, how to configure it and why it didn\u0026rsquo;t work yesterday. The correct response is: \u0026ldquo;OK, let\u0026rsquo;s talk after the standup, you and me.\u0026rdquo;\nThis discipline is what keeps the standup under 15 minutes. Without it, every blocker becomes a meeting inside the meeting.\n💀 When the standup dies #I\u0026rsquo;ve identified a fairly precise list of the ways a standup can die. 
I list them not out of pessimism, but because recognising them is the only way to prevent them.\nThe thread killer #One person describes a complex technical problem. Another person responds. A dialogue starts between two people, while six others stand idle. The facilitator doesn\u0026rsquo;t intervene because \u0026ldquo;it\u0026rsquo;s an important topic\u0026rdquo;. Fifteen minutes gone.\nThe improvised demo #\u0026ldquo;Wait, let me show you what I did.\u0026rdquo; Screen share, application walkthrough, UI detail explanations. Interesting? Maybe. Relevant to the standup? No.\nThe manager who asks questions #The PM or team lead starts probing: \u0026ldquo;Is that feature at 60% or 70%? When do you expect to finish? Have you talked to the client?\u0026rdquo; The standup turns into an individual status report.\nThe missing facilitator #Without someone keeping the pace, the standup becomes a free-form conversation. Free-form conversations are lovely at the pub, not at 9:15 in the morning when eight people have work to do.\nThe open laptop #When people bring laptops to the standup, the implicit message is: \u0026ldquo;This meeting doesn\u0026rsquo;t deserve my full attention.\u0026rdquo; And they\u0026rsquo;re right — if the standup lasts 40 minutes, it doesn\u0026rsquo;t.\n🛠️ How to make a standup actually work #After twenty years of projects, here\u0026rsquo;s my recipe. It\u0026rsquo;s not elegant, it\u0026rsquo;s not textbook, but it works.\n1. Visible timer #A timer on the shared screen (or a phone placed on the table) that starts when the standup begins. Everyone sees it. When it hits 15 minutes, the standup ends. Full stop.\nIt\u0026rsquo;s not authoritarian. It\u0026rsquo;s a team agreement. The timer is not the enemy — it\u0026rsquo;s the guardian of everyone\u0026rsquo;s time.\n2. Facilitator with the mandate to cut #You need a person — rotating or permanent — whose sole job is to say: \u0026ldquo;OK, we\u0026rsquo;ll dig into that after. 
Next.\u0026rdquo; It\u0026rsquo;s not rudeness. It\u0026rsquo;s respect for the six people who are waiting.\nThe best facilitator does it naturally: \u0026ldquo;Interesting, let\u0026rsquo;s discuss right after. Marco, your turn.\u0026rdquo;\n3. Standing up, for real #It\u0026rsquo;s not folklore. Standing has a concrete psychological effect: people want to finish quickly. When you sit down, you relax. When you stand, you tend towards brevity.\nIf the team is remote, the principle translates to: cameras on, no multitasking. The signal should be: \u0026ldquo;These 15 minutes have my full attention.\u0026rdquo;\n4. No laptops, no screen sharing #The standup is verbal. If something requires a demo, a diagram, a visual explanation — that\u0026rsquo;s not standup material. That\u0026rsquo;s a separate meeting, with the right people.\n5. Parking lot #Every time a topic comes up that deserves deeper discussion, the facilitator writes it on a visible list — the \u0026ldquo;parking lot\u0026rdquo;. After the standup, the people involved stay and discuss. Everyone else goes to work.\nThe parking lot is the most underrated tool in standup management. It lets you say \u0026ldquo;we\u0026rsquo;ll discuss that later\u0026rdquo; without the topic being forgotten.\n📊 The standup in numbers #Let\u0026rsquo;s do a calculation nobody ever does.\nA team of 8 people. Daily standup. 220 working days per year.\nScenario Duration Hours/person/year Total team hours/year 15-minute standup 15 min 55 hours 440 hours 30-minute standup 30 min 110 hours 880 hours 45-minute standup 45 min 165 hours 1,320 hours The difference between a well-managed standup and one that\u0026rsquo;s out of control is 880 hours per year. For a team of 8 people. That\u0026rsquo;s 110 working days. Nearly five person-months.\nAnd that\u0026rsquo;s without counting the indirect effect: a 45-minute standup doesn\u0026rsquo;t just steal 45 minutes. 
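The table's arithmetic is easy to verify; here is a quick Python sketch using the same assumptions as the scenario above (8 people, 220 working days per year):

```python
# Yearly cost of a daily standup, per person and for the whole team.
TEAM_SIZE = 8
WORKING_DAYS = 220

def standup_hours(minutes_per_day: int) -> tuple[float, float]:
    """Return (hours per person per year, total team hours per year)."""
    per_person = minutes_per_day * WORKING_DAYS / 60
    return per_person, per_person * TEAM_SIZE

for minutes in (15, 30, 45):
    person, team = standup_hours(minutes)
    # 15 min -> 55 h/person/year, 440 team hours/year
    print(f"{minutes} min -> {person:.0f} h/person/year, {team:.0f} team hours/year")
```

The 880-hour gap between the 15- and 30-minute scenarios is the figure quoted above: 880 hours at 8 hours per day is 110 working days. And that is the direct cost alone.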
It steals the 10-15 minutes of focus needed afterwards to get back into the flow.\n🔄 Remote standups vs in-person #Since 2020 standups are often remote. The medium changes, but the principles stay the same. With a few extra precautions.\nRemote is worse (if you\u0026rsquo;re not careful) # Audio latency creates overlaps that stretch the time Multitasking is invisible (but real) The lack of body language makes it harder for the facilitator to know when to cut Screen sharing is one click away, and the temptation to use it is strong How to manage a remote standup # Practice Reason Predefined speaking order Avoids the \u0026ldquo;who\u0026rsquo;s next?\u0026rdquo; and awkward silences Cameras on Signals presence and attention Chat for the parking lot Captures topics in real time without interrupting Shared timer on screen Same principle as in-person standups Everyone muted except the speaker Eliminates background noise and interrupt temptation The most effective trick I\u0026rsquo;ve found for remote standups is the relay round: each person, after speaking, names the next one. \u0026ldquo;I\u0026rsquo;m done. Sara, you\u0026rsquo;re up.\u0026rdquo; This keeps attention active and gives rhythm to the meeting.\n🎯 The standup is a tool, not a ritual #What has always struck me is how easily the standup becomes an empty ritual. You do it because \u0026ldquo;that\u0026rsquo;s what you do\u0026rdquo;, because \u0026ldquo;we\u0026rsquo;re agile\u0026rdquo;, because \u0026ldquo;the framework says so\u0026rdquo;. But nobody asks any more: is it working?\nA standup works when the team perceives it as useful. When at 9:15 people show up willingly, say their piece in two minutes, listen to others, and by 9:30 they\u0026rsquo;re at their desk knowing exactly what\u0026rsquo;s happening in the project.\nA standup doesn\u0026rsquo;t work when people see it as an obligation. When they sigh looking at the clock. When they check their phones. 
When they think \u0026ldquo;I could have used that half hour to work.\u0026rdquo;\nThe difference between the two scenarios is almost always the same: whether the 15-minute constraint is respected or not.\nYou don\u0026rsquo;t need sophisticated frameworks. You don\u0026rsquo;t need certifications. You need a timer, a facilitator with a backbone, and the awareness that people\u0026rsquo;s time has value.\nFifteen minutes. Three questions. Parking lot for the rest.\nEverything else is noise.\nGlossary #Daily Standup — Daily meeting of maximum 15 minutes where each team member answers three questions: what I did yesterday, what I will do today, what is blocking me.\nParking Lot — Visible list of topics that emerge during a meeting and deserve further discussion but are deferred to respect the timebox.\nFacilitator — Person responsible for guiding a meeting by maintaining focus, respecting the timebox, and ensuring everyone has a voice.\nTimeboxing — Time management technique that assigns a fixed, non-negotiable interval to an activity, forcing conclusion within the established limit.\nScrum — Agile framework for project management that organizes work into fixed-length sprints, with defined roles and structured ceremonies.\n","date":"27 January 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/standup-meeting-15-minuti/","section":"Database Strategy","summary":"\u003cp\u003eFirst Monday of the project. New team, new methodology, new hopes. The PM proposes a daily standup. Everyone nods. \u0026ldquo;Fifteen minutes, standing up, three questions. Simple.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eThe first week works. At 9:15 it starts, by 9:28 everyone is back at their desk. Each person speaks for two minutes, blockers are flagged, people move on. 
Pure efficiency.\u003c/p\u003e\n\u003cp\u003eThe second week someone raises a hand mid-round: \u0026ldquo;Can I quickly explain the problem I\u0026rsquo;m having with the integration?\u0026rdquo; Five minutes of technical discussion between two people. The other six stand there listening to something that doesn\u0026rsquo;t concern them.\u003c/p\u003e","title":"Standup meetings: why they only work if they last 15 minutes"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/team-management/","section":"Tags","summary":"","title":"Team-Management"},{"content":"It has happened to me more than once: I walk into an Oracle environment and find the same situation. Every application user connected as the schema owner, with the DBA role granted. Developers, batch jobs, reporting tools — all running with the same privileges as the user that owns the tables.\nWhen you ask why, the answer is always some variation of: \u0026ldquo;This way everything works without permission issues.\u0026rdquo;\nSure. Everything works. Until the day a developer runs a DROP TABLE on the wrong table. Or a batch import does a TRUNCATE on a production table thinking it is in the test environment. Or someone runs a DELETE FROM customers without a WHERE clause.\nThat day the problem is no longer about permissions. It is that you have no idea who did what, and no tool to prevent it from happening again.\nThe context: a pattern that keeps repeating #The client was a mid-sized company with an ERP application running on Oracle 19c. About twenty users — a mix of developers, application accounts and operators. The application schema — let us call it APP_OWNER — held roughly 300 tables, about sixty views and a few dozen PL/SQL procedures.\nThe problem was easy to describe:\nEveryone connected as APP_OWNER APP_OWNER had the DBA role No audit configured No separation between readers and writers Passwords were shared via email It was not negligence. It was inertia. 
The system had grown that way over the years, and nobody had ever stopped to rethink the model. It worked, and that was enough.\nUntil an operator accidentally deleted an entire quarter\u0026rsquo;s invoicing data. No log, no trail, no identifiable culprit. Only a two-day-old backup and a data gap that took weeks to fill.\nHow Oracle security works: the model #Before describing what I did, it helps to understand how Oracle structures security. The model is different from PostgreSQL and MySQL, and the differences are not cosmetic.\nUser and schema: the same thing (almost) #In Oracle, creating a user means creating a schema. They are not separate concepts: the user APP_OWNER is also the schema APP_OWNER, and the objects created by that user live in that schema.\nCREATE USER app_read IDENTIFIED BY \u0026#34;PasswordSecure#2026\u0026#34; DEFAULT TABLESPACE users TEMPORARY TABLESPACE temp QUOTA 0 ON users; The QUOTA 0 is intentional: this user is not supposed to create objects. It is a consumer, not an owner.\nSystem privileges vs object privileges #Oracle draws a clear line between:\nSystem privileges: global operations like CREATE TABLE, CREATE SESSION, ALTER SYSTEM Object privileges: operations on specific objects like SELECT ON app_owner.customers, EXECUTE ON app_owner.pkg_invoices The DBA role includes over 200 system privileges. Granting it to an application user is like handing the keys to the entire building to someone who only needs to enter one room.\nRoles: predefined and custom #Oracle offers predefined roles (CONNECT, RESOURCE, DBA) and allows custom ones. The predefined roles carry a historical problem: CONNECT and RESOURCE used to include excessive privileges in older versions. 
From Oracle 12c onward they were trimmed, but the habit of granting them without a second thought dies hard.\nThe right path is building custom roles calibrated to actual needs.\nThe implementation: three roles, zero ambiguity #I designed three roles: read, write and application administration.\n1. Read-only role #CREATE ROLE app_read_role; -- Table privileges GRANT SELECT ON app_owner.customers TO app_read_role; GRANT SELECT ON app_owner.orders TO app_read_role; GRANT SELECT ON app_owner.invoices TO app_read_role; GRANT SELECT ON app_owner.products TO app_read_role; GRANT SELECT ON app_owner.transactions TO app_read_role; -- View privileges GRANT SELECT ON app_owner.v_sales_report TO app_read_role; GRANT SELECT ON app_owner.v_order_status TO app_read_role; In an environment with 300 tables you do not list them one by one manually. I used a PL/SQL block to generate the grants:\nBEGIN FOR t IN (SELECT table_name FROM dba_tables WHERE owner = \u0026#39;APP_OWNER\u0026#39;) LOOP EXECUTE IMMEDIATE \u0026#39;GRANT SELECT ON app_owner.\u0026#39; || t.table_name || \u0026#39; TO app_read_role\u0026#39;; END LOOP; END; / Simple, repeatable, and above all: documented. Because six months from now someone will need to understand what was done and why.\n2. Read-write role #CREATE ROLE app_write_role; -- Inherits everything from the read role GRANT app_read_role TO app_write_role; -- Adds DML on operational tables GRANT INSERT, UPDATE, DELETE ON app_owner.orders TO app_write_role; GRANT INSERT, UPDATE, DELETE ON app_owner.transactions TO app_write_role; GRANT INSERT, UPDATE ON app_owner.customers TO app_write_role; -- Execute permission on application procedures GRANT EXECUTE ON app_owner.pkg_orders TO app_write_role; GRANT EXECUTE ON app_owner.pkg_invoices TO app_write_role; Note: no DELETE on the customers table. Not because it is technically impossible, but because the application process calls for deactivation, not deletion. 
The privilege reflects the process, not convenience.\n3. Application administration role #CREATE ROLE app_admin_role; -- Inherits the write role GRANT app_write_role TO app_admin_role; -- Adds controlled DDL GRANT CREATE VIEW TO app_admin_role; GRANT CREATE PROCEDURE TO app_admin_role; GRANT CREATE SYNONYM TO app_admin_role; -- Can manage configuration tables GRANT INSERT, UPDATE, DELETE ON app_owner.parameters TO app_admin_role; GRANT INSERT, UPDATE, DELETE ON app_owner.lookup_types TO app_admin_role; No CREATE TABLE, no DROP ANY, no ALTER SYSTEM. The application admin manages logic, not physical structure.\nUser creation and role assignment #-- Reporting user (read-only) CREATE USER srv_report IDENTIFIED BY \u0026#34;RptSecure#2026\u0026#34; DEFAULT TABLESPACE users TEMPORARY TABLESPACE temp QUOTA 0 ON users; GRANT CREATE SESSION TO srv_report; GRANT app_read_role TO srv_report; -- Application user (read-write) CREATE USER srv_app IDENTIFIED BY \u0026#34;AppSecure#2026\u0026#34; DEFAULT TABLESPACE users TEMPORARY TABLESPACE temp QUOTA 0 ON users; GRANT CREATE SESSION TO srv_app; GRANT app_write_role TO srv_app; -- Application DBA (administration) CREATE USER dba_app IDENTIFIED BY \u0026#34;DbaSecure#2026\u0026#34; DEFAULT TABLESPACE users TEMPORARY TABLESPACE temp QUOTA 10M ON users; GRANT CREATE SESSION TO dba_app; GRANT app_admin_role TO dba_app; Each user has its own password, a specific role and a disk quota consistent with its purpose. srv_report has no quota because it should not create anything. dba_app gets 10 MB because it needs to create views and procedures.\nRevoking the DBA role #The most delicate step: removing DBA from APP_OWNER.\nREVOKE DBA FROM app_owner; One line. 
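Before a REVOKE like this one, it helps to compute the exact delta between what a user holds and what it actually needs. A minimal sketch in Python; the two privilege sets are hypothetical examples, not the client's real grants (in practice the granted set would be loaded from DBA_SYS_PRIVS):

```python
# Plan least-privilege changes: diff what is granted against what is required.
# Both sets below are illustrative, not taken from a real Oracle instance.

def plan_privileges(granted: set[str], required: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_grant, to_revoke) so the user ends up with exactly `required`."""
    return required - granted, granted - required

granted = {"CREATE SESSION", "CREATE TABLE", "DROP ANY TABLE", "ALTER SYSTEM"}
required = {"CREATE SESSION", "CREATE TABLE", "CREATE VIEW", "CREATE PROCEDURE"}

to_grant, to_revoke = plan_privileges(granted, required)
for p in sorted(to_grant):
    print(f"GRANT {p} TO app_owner;")
for p in sorted(to_revoke):
    print(f"REVOKE {p} FROM app_owner;")
```

The same diff doubles as a drift check in scheduled audits: if to_revoke ever comes back non-empty, someone has granted more than the role model allows.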
But before running it, I verified that APP_OWNER still had the privileges needed to own its objects:\nSELECT privilege FROM dba_sys_privs WHERE grantee = \u0026#39;APP_OWNER\u0026#39;; SELECT granted_role FROM dba_role_privs WHERE grantee = \u0026#39;APP_OWNER\u0026#39;; Then I granted only what was strictly necessary:\nGRANT CREATE SESSION TO app_owner; GRANT CREATE TABLE TO app_owner; GRANT CREATE VIEW TO app_owner; GRANT CREATE PROCEDURE TO app_owner; GRANT CREATE SEQUENCE TO app_owner; GRANT UNLIMITED TABLESPACE TO app_owner; APP_OWNER remains the owner of the objects but no longer has the power to do anything on the database. It is an owner, not a god.\nAudit: knowing who did what #Having the right roles is not enough. You need to know who did what, especially for critical operations.\nSince version 12c, Oracle offers Unified Audit, replacing the old traditional audit with a centralized system.\n-- Audit critical DDL operations CREATE AUDIT POLICY pol_critical_ddl ACTIONS CREATE TABLE, DROP TABLE, ALTER TABLE, TRUNCATE TABLE, CREATE USER, DROP USER, ALTER USER, GRANT, REVOKE; ALTER AUDIT POLICY pol_critical_ddl ENABLE; -- Audit sensitive data access CREATE AUDIT POLICY pol_data_access ACTIONS SELECT ON app_owner.customers, DELETE ON app_owner.invoices, UPDATE ON app_owner.invoices; ALTER AUDIT POLICY pol_data_access ENABLE; -- Audit failed logins CREATE AUDIT POLICY pol_failed_logins ACTIONS LOGON; ALTER AUDIT POLICY pol_failed_logins ENABLE WHENEVER NOT SUCCESSFUL; To check what is being recorded:\nSELECT * FROM unified_audit_trail WHERE event_timestamp \u0026gt; SYSDATE - 7 ORDER BY event_timestamp DESC; Audit is not paranoia. It is the only way to answer the question \u0026ldquo;who did what?\u0026rdquo; without relying on guesswork.\nComparison with PostgreSQL and MySQL #This article is the third in a series on security management in relational databases. 
The first two cover PostgreSQL and MySQL.\nThe differences among the three systems are substantial:\nAspect Oracle PostgreSQL MySQL User = schema? Yes No (independent) Yes (separate databases) Role model Predefined + custom Everything is a ROLE Roles from MySQL 8.0 Identity Username Username user@host pair Native audit Unified Audit (12c+) pgAudit (extension) Audit plugin Granular privileges System + Object Database/Schema/Object Global/DB/Table/Column GRANT ALL Exists but dangerous Exists, discouraged Exists, discouraged In PostgreSQL everything is a ROLE, and the simplicity of the model is its strength. In MySQL identity is tied to the originating host, adding a layer of complexity (and security) that the others lack. In Oracle the model is the richest and the most granular, but also the easiest to misconfigure because of the sheer number of options.\nThe principle remains the same everywhere: give each user only what they need, not one privilege more.\nWhat changed afterwards #The transition was gradual — two weeks for the full rollout, with testing on every application and procedure. A few scripts stopped working because they took for granted privileges they were never entitled to. Every error was actually a hidden problem that had been invisible before.\nThe result:\n20 named users instead of a single shared schema 3 custom roles instead of the DBA role Active audit on DDL and sensitive operations Zero incidents of accidental deletion in the following months The client did not notice performance improvements. That was not the goal. What they noticed was that when someone made a mistake, the damage was contained and traceable. And in a production environment, that is worth more than any optimization.\nConclusion #GRANT ALL PRIVILEGES and the DBA role are shortcuts. They work in the sense that they eliminate permission errors. 
But they also eliminate every layer of protection.\nSecurity in Oracle is not a tooling problem — the tools are there, and they are powerful. It is a design problem: deciding who can do what, documenting it, implementing it and then verifying that it works.\nIt is not the most glamorous work in the world. But it is the work that makes the difference between a database that merely survives and one that is truly under control.\nGlossary #System Privilege — Oracle privilege that authorizes global database operations such as CREATE TABLE, CREATE SESSION or ALTER SYSTEM, independent of any specific object.\nObject Privilege — Oracle privilege that authorizes operations on a specific database object such as SELECT, INSERT or EXECUTE on a table, view or procedure.\nREVOKE — SQL command to remove privileges or roles previously granted to a user or role, complementary to the GRANT command.\nUnified Audit — Centralized auditing system introduced in Oracle 12c that unifies all audit types into a single infrastructure, replacing the legacy traditional audit.\nLeast Privilege — Security principle that prescribes assigning to each user only the permissions strictly necessary to perform their function.\n","date":"27 January 2026","permalink":"https://ivanluminaria.com/en/posts/oracle/oracle-roles-privileges/","section":"Database Strategy","summary":"\u003cp\u003eIt has happened to me more than once: I walk into an Oracle environment and find the same situation. Every application user connected as the schema owner, with the DBA role granted. Developers, batch jobs, reporting tools — all running with the same privileges as the user that owns the tables.\u003c/p\u003e\n\u003cp\u003eWhen you ask why, the answer is always some variation of: \u0026ldquo;This way everything works without permission issues.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eSure. Everything works. Until the day a developer runs a \u003ccode\u003eDROP TABLE\u003c/code\u003e on the wrong table. 
Or a batch import does a \u003ccode\u003eTRUNCATE\u003c/code\u003e on a production table thinking it is in the test environment. Or someone runs a \u003ccode\u003eDELETE FROM customers\u003c/code\u003e without a \u003ccode\u003eWHERE\u003c/code\u003e clause.\u003c/p\u003e","title":"Users, Roles and Privileges in Oracle: Why GRANT ALL Is Never the Answer"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/categories/data-warehouse/","section":"Categories","summary":"","title":"Data-Warehouse"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/dimensional-modeling/","section":"Tags","summary":"","title":"Dimensional-Modeling"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/etl/","section":"Tags","summary":"","title":"Etl"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/hierarchies/","section":"Tags","summary":"","title":"Hierarchies"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/olap/","section":"Tags","summary":"","title":"Olap"},{"content":"Three levels. Top Group, Group, Client. It looks like a trivial structure — the kind of hierarchy you draw on a whiteboard in five minutes and that any BI tool should handle without issues.\nThen you discover that not all clients belong to a group. And that not all groups belong to a top group. And that the aggregation reports the business asks for — revenue by top group, client count by group, drill-down from the top to the leaf — produce wrong or incomplete results because the hierarchy has holes.\nIn technical jargon it is called a ragged hierarchy : a hierarchy where not all branches reach the same depth. In the real world it is called \u0026ldquo;the problem nobody sees until they open the report and the numbers do not add up.\u0026rdquo;\nThe client and the original model #The project was a data warehouse for a company in the energy sector — gas distribution and related services. 
The source system managed a client master with a hierarchical structure: clients could be grouped under a commercial entity (the Group), and groups could in turn belong to a higher entity (the Top Group).\nThe model in the source was a single table with hierarchical references:\nCREATE TABLE stg_clienti ( client_id NUMBER(10) NOT NULL, client_name VARCHAR2(100) NOT NULL, group_id NUMBER(10), group_name VARCHAR2(100), top_group_id NUMBER(10), top_group_name VARCHAR2(100), revenue NUMBER(15,2), region VARCHAR2(50), CONSTRAINT pk_stg_clienti PRIMARY KEY (client_id) ); Here is a data sample:\nINSERT INTO stg_clienti VALUES (1001, \u0026#39;Rossi Energia Srl\u0026#39;, 10, \u0026#39;Consorzio Nord\u0026#39;, 100, \u0026#39;Holding Nazionale\u0026#39;, 125000.00, \u0026#39;Lombardia\u0026#39;); INSERT INTO stg_clienti VALUES (1002, \u0026#39;Bianchi Gas SpA\u0026#39;, 10, \u0026#39;Consorzio Nord\u0026#39;, 100, \u0026#39;Holding Nazionale\u0026#39;, 89000.00, \u0026#39;Piemonte\u0026#39;); INSERT INTO stg_clienti VALUES (1003, \u0026#39;Verdi Distribuzione\u0026#39;, 20, \u0026#39;Gruppo Centro\u0026#39;, 100, \u0026#39;Holding Nazionale\u0026#39;, 67000.00, \u0026#39;Toscana\u0026#39;); INSERT INTO stg_clienti VALUES (1004, \u0026#39;Neri Servizi\u0026#39;, 20, \u0026#39;Gruppo Centro\u0026#39;, NULL, NULL, 45000.00, \u0026#39;Lazio\u0026#39;); INSERT INTO stg_clienti VALUES (1005, \u0026#39;Gialli Utilities\u0026#39;, NULL, NULL, NULL, NULL, 38000.00, \u0026#39;Sicilia\u0026#39;); INSERT INTO stg_clienti VALUES (1006, \u0026#39;Blu Energia\u0026#39;, NULL, NULL, NULL, NULL, 52000.00, \u0026#39;Sardegna\u0026#39;); INSERT INTO stg_clienti VALUES (1007, \u0026#39;Viola Gas Srl\u0026#39;, 30, \u0026#39;Rete Sud\u0026#39;, NULL, NULL, 71000.00, \u0026#39;Campania\u0026#39;); INSERT INTO stg_clienti VALUES (1008, \u0026#39;Arancio Distribuzione\u0026#39;, 30, \u0026#39;Rete Sud\u0026#39;, NULL, NULL, 33000.00, \u0026#39;Calabria\u0026#39;); Look at the data carefully. 
There are four different situations:\nClient 1001, 1002, 1003: complete hierarchy — Client → Group → Top Group Client 1004: has a Group but the Group has no Top Group Client 1005, 1006: no Group, no Top Group — direct clients Client 1007, 1008: have a Group (Rete Sud) but the Group has no Top Group This is a ragged hierarchy. Three levels on paper, but in reality the branches have different depths.\nThe problem: the reports do not add up #The business asked for a simple report: revenue aggregated by Top Group, with drill-down capability by Group and then by Client. A reasonable request — the kind of thing you expect from any DWH.\nThe most natural query:\nSELECT top_group_name, group_name, COUNT(*) AS num_clients, SUM(revenue) AS total_revenue FROM stg_clienti GROUP BY top_group_name, group_name ORDER BY top_group_name, group_name; The result:\nTOP_GROUP_NAME GROUP_NAME NUM_CLIENTS TOTAL_REVENUE ------------------ ---------------- ----------- ------------- Holding Nazionale Consorzio Nord 2 214000.00 Holding Nazionale Gruppo Centro 1 67000.00 (null) Gruppo Centro 1 45000.00 (null) Rete Sud 2 104000.00 (null) (null) 2 90000.00 Five rows. And at least three problems.\nGruppo Centro appears twice: once under \u0026ldquo;Holding Nazionale\u0026rdquo; (client 1003 which has a top group) and once under NULL (client 1004 whose top group is NULL). The same group, split across two rows, with separate totals. Anyone looking at this report will think Gruppo Centro has 67K revenue under the holding and 45K somewhere else. In reality it is a single group with 112K total.\nThe direct clients (Gialli Utilities and Blu Energia) end up in a row with two NULLs. Management does not know what to do with a nameless row.\nThe Top Group total is wrong because the NULL rows are missing. 
If you sum only the rows with a top group, you lose 239K in revenue — almost half (46%) of the total.\nThe classic approach: COALESCE and prayers #The first reaction, the one I see in 90% of cases, is to add `COALESCE` to the query:\nSELECT COALESCE(top_group_name, group_name, client_name) AS top_group_name, COALESCE(group_name, client_name) AS group_name, client_name, revenue FROM stg_clienti; Does it work? In a sense yes — it fills the holes. But it introduces new problems.\nClient \u0026ldquo;Gialli Utilities\u0026rdquo; now appears as Top Group, Group and Client simultaneously. If the business wants to count how many Top Groups there are, the number is inflated. If they want to filter for \u0026ldquo;real\u0026rdquo; top groups, there is no way to distinguish them from clients promoted to top group by the COALESCE.\nAnd this is the simple case, with three levels. I have seen five-level hierarchies managed with chains of nested COALESCE, multiple CASE WHEN expressions, and report logic so convoluted that nobody dared touch it anymore. Every new business request required cascading changes across all queries.\nThe root problem is that COALESCE is a patch applied at the presentation layer. It does not fix the structural issue: the hierarchy is incomplete and the dimensional model does not know it.\nThe solution: self-parenting #The principle is simple: whoever has no parent becomes their own parent. This technique is called self-parenting.\nA Client without a Group? That client becomes its own Group. A Group without a Top Group? That group becomes its own Top Group. This way the hierarchy is always complete at three levels, with no holes, no NULLs.\nIt is not a trick. It is a standard technique in dimensional modeling, described by Kimball and used in production for decades. The idea is that the hierarchical dimension in the DWH must be balanced: every record must have a valid value for every level of the hierarchy. 
If the source does not guarantee it, the ETL does.\nThe dimensional table #CREATE TABLE dim_client_hierarchy ( client_key NUMBER(10) NOT NULL, client_id NUMBER(10) NOT NULL, client_name VARCHAR2(100) NOT NULL, group_id NUMBER(10) NOT NULL, group_name VARCHAR2(100) NOT NULL, top_group_id NUMBER(10) NOT NULL, top_group_name VARCHAR2(100) NOT NULL, region VARCHAR2(50), is_direct_client CHAR(1) DEFAULT \u0026#39;N\u0026#39;, is_standalone_group CHAR(1) DEFAULT \u0026#39;N\u0026#39;, CONSTRAINT pk_dim_client_hier PRIMARY KEY (client_key) ); Notice two things. First: no column is nullable. Group and Top Group are NOT NULL. Second: I added two flags — is_direct_client and is_standalone_group — that allow distinguishing artificially balanced records from those with a natural hierarchy. This is important: the business must be able to filter \u0026ldquo;real\u0026rdquo; top groups from promoted clients.\nThe ETL logic #INSERT INTO dim_client_hierarchy ( client_key, client_id, client_name, group_id, group_name, top_group_id, top_group_name, region, is_direct_client, is_standalone_group ) SELECT client_id AS client_key, client_id, client_name, -- If no group, the client becomes its own group COALESCE(group_id, client_id) AS group_id, COALESCE(group_name, client_name) AS group_name, -- If no top group, the group (or client) becomes its own top group COALESCE(top_group_id, group_id, client_id) AS top_group_id, COALESCE(top_group_name, group_name, client_name) AS top_group_name, region, CASE WHEN group_id IS NULL THEN \u0026#39;Y\u0026#39; ELSE \u0026#39;N\u0026#39; END AS is_direct_client, CASE WHEN group_id IS NOT NULL AND top_group_id IS NULL THEN \u0026#39;Y\u0026#39; ELSE \u0026#39;N\u0026#39; END AS is_standalone_group FROM stg_clienti; Look at the COALESCE cascade in the transformation. 
The logic is:\ngroup_id: if the client has a group, use it; otherwise use the client itself top_group_id: if there is a top group, use it; if not but there is a group, use the group; if there is no group either, use the client Every \u0026ldquo;missing\u0026rdquo; level is filled by the level immediately below. The result is a hierarchy that is always complete.\nThe result after balancing #SELECT client_key, client_name, group_name, top_group_name, is_direct_client, is_standalone_group FROM dim_client_hierarchy ORDER BY is_direct_client, is_standalone_group, client_id; KEY CLIENT_NAME GROUP_NAME TOP_GROUP_NAME DIRECT STANDALONE ---- -------------------- ---------------- ------------------ ------ ---------- 1001 Rossi Energia Srl Consorzio Nord Holding Nazionale N N 1002 Bianchi Gas SpA Consorzio Nord Holding Nazionale N N 1003 Verdi Distribuzione Gruppo Centro Holding Nazionale N N 1004 Neri Servizi Gruppo Centro Gruppo Centro N Y 1007 Viola Gas Srl Rete Sud Rete Sud N Y 1008 Arancio Distribuzione Rete Sud Rete Sud N Y 1005 Gialli Utilities Gialli Utilities Gialli Utilities Y N 1006 Blu Energia Blu Energia Blu Energia Y N Eight rows, zero NULLs. Every client has a group and a top group. 
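A quick data-quality check makes the balancing verifiable in the ETL itself. This is a sketch against the dimension defined above; since group and top group are declared NOT NULL, the first counter matters mainly in pipelines that load before enforcing constraints:

```sql
-- Verify the hierarchy is balanced and count the artificially promoted records.
SELECT COUNT(CASE WHEN group_id IS NULL OR top_group_id IS NULL THEN 1 END) AS broken_rows,
       COUNT(CASE WHEN is_direct_client = 'Y' THEN 1 END)                   AS direct_clients,
       COUNT(CASE WHEN is_standalone_group = 'Y' THEN 1 END)                AS standalone_groups
FROM   dim_client_hierarchy;
-- On the sample data: broken_rows = 0, direct_clients = 2, standalone_groups = 3
```

Anything other than zero broken rows means the COALESCE cascade in the ETL missed a case.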
The flags tell the truth: Gialli and Blu are direct clients (self-parented at all levels), Gruppo Centro and Rete Sud are standalone groups (self-parented at the top group level).\nReports after balancing #The same aggregation query that previously produced broken results, now against the dimension. Note the d. qualifiers: stg_clienti carries the same column names, so unqualified references would be ambiguous.\nSELECT d.top_group_name, d.group_name, COUNT(*) AS num_clients, SUM(f.revenue) AS total_revenue FROM dim_client_hierarchy d JOIN stg_clienti f ON d.client_id = f.client_id GROUP BY d.top_group_name, d.group_name ORDER BY d.top_group_name, d.group_name; TOP_GROUP_NAME GROUP_NAME NUM_CLIENTS TOTAL_REVENUE ------------------ ------------------ ----------- ------------- Blu Energia Blu Energia 1 52000.00 Gialli Utilities Gialli Utilities 1 38000.00 Gruppo Centro Gruppo Centro 1 45000.00 Holding Nazionale Consorzio Nord 2 214000.00 Holding Nazionale Gruppo Centro 1 67000.00 Rete Sud Rete Sud 2 104000.00 No NULLs. Every row has an identifiable top group and group. The totals add up.\nAnd if the business wants only \u0026ldquo;real\u0026rdquo; top groups, excluding promoted clients:\nSELECT d.top_group_name, COUNT(*) AS num_clients, SUM(f.revenue) AS total_revenue FROM dim_client_hierarchy d JOIN stg_clienti f ON d.client_id = f.client_id WHERE d.is_direct_client = \u0026#39;N\u0026#39; AND d.is_standalone_group = \u0026#39;N\u0026#39; GROUP BY d.top_group_name ORDER BY total_revenue DESC; TOP_GROUP_NAME NUM_CLIENTS TOTAL_REVENUE ------------------ ----------- ------------- Holding Nazionale 3 281000.00 The flags make everything filterable. No conditional logic in the report, no CASE WHEN, no COALESCE. 
The dimensional model already contains all the information needed.\nThe full drill-down #The real test of a balanced hierarchy is drill-down: from the highest level to the lowest, with no surprises.\n-- Level 1: total by Top Group SELECT d.top_group_name, COUNT(DISTINCT d.group_id) AS num_groups, COUNT(*) AS num_clients, SUM(f.revenue) AS revenue FROM dim_client_hierarchy d JOIN stg_clienti f ON d.client_id = f.client_id GROUP BY d.top_group_name ORDER BY revenue DESC; TOP_GROUP_NAME NUM_GROUPS NUM_CLIENTS REVENUE ------------------ ---------- ----------- ---------- Holding Nazionale 2 3 281000.00 Rete Sud 1 2 104000.00 Blu Energia 1 1 52000.00 Gruppo Centro 1 1 45000.00 Gialli Utilities 1 1 38000.00 -- Level 2: drill-down into \u0026#34;Holding Nazionale\u0026#34; SELECT d.group_name, COUNT(*) AS num_clients, SUM(f.revenue) AS revenue FROM dim_client_hierarchy d JOIN stg_clienti f ON d.client_id = f.client_id WHERE d.top_group_name = \u0026#39;Holding Nazionale\u0026#39; GROUP BY d.group_name ORDER BY revenue DESC; GROUP_NAME NUM_CLIENTS REVENUE ---------------- ----------- ---------- Consorzio Nord 2 214000.00 Gruppo Centro 1 67000.00 -- Level 3: drill-down into \u0026#34;Consorzio Nord\u0026#34; SELECT d.client_name, f.revenue FROM dim_client_hierarchy d JOIN stg_clienti f ON d.client_id = f.client_id WHERE d.group_name = \u0026#39;Consorzio Nord\u0026#39; ORDER BY f.revenue DESC; CLIENT_NAME REVENUE ------------------- ---------- Rossi Energia Srl 125000.00 Bianchi Gas SpA 89000.00 Three levels of drill-down, zero NULLs, zero conditional logic. The hierarchy is balanced and the numbers add up at every level.\nWhy COALESCE in reports is not enough #Someone might object: \u0026ldquo;But COALESCE in the report does the same thing, without needing to change the model.\u0026rdquo;\nNo. It does something similar, but with three fundamental differences.\nFirst: COALESCE must be repeated everywhere. Every query, every report, every dashboard, every extract. 
If you have twenty reports using the hierarchy, you must remember to apply the COALESCE in all twenty. And when the twenty-first arrives, you must remember again. Self-parenting in the dimensional model is done once in the ETL and that is it.\nSecond: COALESCE does not distinguish. You cannot tell whether \u0026ldquo;Gialli Utilities\u0026rdquo; in the top_group field is a real top group or a promoted client. With flags in the dimensional model you have the information to filter. Without flags, the business is blind.\nThird: performance. A GROUP BY with COALESCE on nullable columns is less efficient than a GROUP BY on NOT NULL columns. Oracle\u0026rsquo;s optimizer handles NOT NULL constrained columns better — it can eliminate NULL checks, use indexes more aggressively, and produce simpler execution plans. On a dimensional table with millions of rows, the difference shows.\nWhen to use self-parenting (and when not to) #Self-parenting works well when:\nThe hierarchy has a fixed number of levels (typically 2-5) The main use case is aggregation and drill-down in reports The model is a data warehouse or an OLAP cube Missing levels are the exception, not the rule It does not work well when:\nThe hierarchy is recursive with variable depth (e.g. org charts with N levels) You need to navigate the graph of relationships (e.g. social networks, supply chains) The model is OLTP and self-parenting would create ambiguity in application logic Hierarchy levels change frequently over time For recursive hierarchies with variable depth, the right approach is different: bridge tables, closure tables or parent-child models with recursive CTEs. 
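For comparison, a variable-depth parent-child model is flattened at query time, typically with a recursive CTE. A minimal sketch — the table org_units(unit_id, parent_id, unit_name) is hypothetical, for illustration only:

```sql
-- Walk a parent-child table of arbitrary depth,
-- carrying each node's root ancestor and level.
WITH unit_tree (unit_id, unit_name, root_id, depth) AS (
    SELECT unit_id, unit_name, unit_id, 1
    FROM   org_units
    WHERE  parent_id IS NULL                 -- anchor: the roots
    UNION ALL
    SELECT c.unit_id, c.unit_name, t.root_id, t.depth + 1
    FROM   org_units c
    JOIN   unit_tree t ON c.parent_id = t.unit_id
)
SELECT unit_id, unit_name, root_id, depth
FROM   unit_tree;
```

Here depth is computed at runtime, not fixed in columns — the right shape when the levels genuinely vary, and overkill when they do not.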
These are powerful tools but they solve a different problem.\nSelf-parenting solves a specific problem — fixed-level hierarchies with incomplete branches — and it solves it in the simplest way possible: balancing the structure upstream, in the model, rather than downstream, in the reports.\nThe rule that guides me #I have designed dozens of hierarchical dimensions in twenty years of data warehousing. The rule I carry with me is always the same:\nIf the report needs conditional logic to handle the hierarchy, the problem is in the model, not in the report.\nA report should do GROUP BY and JOIN. If it also has to decide how to handle missing levels, it is doing the ETL\u0026rsquo;s job. And a report that does the ETL\u0026rsquo;s job is a report that will break sooner or later.\nSelf-parenting is not elegant. It is not sophisticated. It is a solution that a freshly graduated computer scientist might find ugly. But it works, it is maintainable, and it transforms a problem that infests every single report into a problem that is solved once, in one place, and never comes back.\nSometimes the best solution is the simplest one. This is one of those times.\nGlossary #COALESCE — A SQL function that returns the first non-NULL value from a list of expressions. Often used as a workaround for incomplete hierarchies in reports, but it doesn\u0026rsquo;t solve the structural problem in the dimensional model.\nDrill-down — Navigation in reports from an aggregated level to a detail level (e.g. from Top Group to Group to Client). Requires a complete and balanced hierarchy to work correctly without NULLs or missing rows.\nOLAP — Online Analytical Processing — processing oriented to multidimensional data analysis, typical of data warehouses and analysis cubes. Contrasted with OLTP (Online Transaction Processing) used in transactional systems.\nRagged hierarchy — A hierarchy where not all branches reach the same depth: some intermediate levels are missing. 
Common in customer master data, products and organizational structures where not all entities share the same hierarchical structure.\nSelf-parenting — A technique for balancing ragged hierarchies: an entity without a parent becomes its own parent. The missing level is filled with data from the level below, eliminating NULLs from the dimension and ensuring correct drill-down behavior.\n","date":"20 January 2026","permalink":"https://ivanluminaria.com/en/posts/data-warehouse/ragged-hierarchies/","section":"Database Strategy","summary":"\u003cp\u003eThree levels. Top Group, Group, Client. It looks like a trivial structure — the kind of hierarchy you draw on a whiteboard in five minutes and that any BI tool should handle without issues.\u003c/p\u003e\n\u003cp\u003eThen you discover that not all clients belong to a group. And that not all groups belong to a top group. And that the aggregation reports the business asks for — revenue by top group, client count by group, \u003cspan class=\"glossary-tip\" tabindex=\"0\" data-glossary-desc=\"Navigation in reports from an aggregated level to a detail level, typical of OLAP analysis and data warehouses.\" data-glossary-url=\"/en/glossary/drill-down/\" data-glossary-more=\"Read more →\"\u003edrill-down\u003c/span\u003e\n from the top to the leaf — produce wrong or incomplete results because the hierarchy has holes.\u003c/p\u003e","title":"Ragged hierarchies: when the client has no parent and the group has no 
grandparent"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/reporting/","section":"Tags","summary":"","title":"Reporting"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/authentication/","section":"Tags","summary":"","title":"Authentication"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/communication/","section":"Tags","summary":"","title":"Communication"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/conflict-management/","section":"Tags","summary":"","title":"Conflict-Management"},{"content":"A few weeks ago a client calls me. Pragmatic tone, seemingly trivial request:\n\u0026ldquo;I need to create a user on MySQL for an application that needs to access a database. Can you take care of it?\u0026rdquo;\nSure. CREATE USER, `GRANT`, next.\nThen he adds: \u0026ldquo;The application runs on two different servers. And sometimes we\u0026rsquo;ll also connect locally for maintenance.\u0026rdquo;\nRight. This is where it stops being trivial. Because in MySQL, creating \u0026ldquo;a user\u0026rdquo; does not mean what you think.\nMySQL\u0026rsquo;s Authentication Model: User + Host #The first thing to understand — and that many DBAs coming from Oracle or PostgreSQL learn the hard way — is that in MySQL a user\u0026rsquo;s identity is not just their name.\nIt is the pair 'user'@'host'.\nThis means that:\n\u0026#39;mario\u0026#39;@\u0026#39;localhost\u0026#39; \u0026#39;mario\u0026#39;@\u0026#39;192.168.1.10\u0026#39; \u0026#39;mario\u0026#39;@\u0026#39;%\u0026#39; are not the same user. They are three different users. With different passwords, different privileges, different behaviors.\nWhen MySQL receives a connection, it looks at two things:\nThe username provided The IP address (or hostname) from which the connection originates Then it searches the mysql.user table for the row that matches the most specific pair. Not the first one found. 
The most specific one.\nWhy This Model? #The design choice is not random. MySQL was born in 1995 for the web. Environments where the same database serves applications running on different machines, different networks, with different access requirements.\nThe user@host model allows you to:\ngrant full access from localhost (for the DBA) grant limited access from a specific application server block everything else No firewall. No VPN. Directly in the authentication engine.\nIt is a powerful model. But if you don\u0026rsquo;t understand it, it bites.\nThe Client\u0026rsquo;s Case: How I Solved It #Back to the request. The application runs on two servers (192.168.1.20 and 192.168.1.21) and local access for maintenance is also needed.\nThe temptation is to create a single user with '%' (wildcard = any host):\nCREATE USER \u0026#39;app_sales\u0026#39;@\u0026#39;%\u0026#39; IDENTIFIED BY \u0026#39;SecurePassword#2026\u0026#39;; GRANT SELECT, INSERT, UPDATE ON sales_db.* TO \u0026#39;app_sales\u0026#39;@\u0026#39;%\u0026#39;; Does it work? Yes. Is it correct? No.\nThe problem with '%' is that it accepts connections from any IP. If someone finds the password tomorrow, they can connect from anywhere in the network. 
Or the world, if the database is exposed.\nThe correct solution is to create specific users for each source:\n-- Access from the primary application server CREATE USER \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39; IDENTIFIED BY \u0026#39;SecurePassword#2026\u0026#39;; GRANT SELECT, INSERT, UPDATE ON sales_db.* TO \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39;; -- Access from the secondary application server CREATE USER \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.21\u0026#39; IDENTIFIED BY \u0026#39;SecurePassword#2026\u0026#39;; GRANT SELECT, INSERT, UPDATE ON sales_db.* TO \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.21\u0026#39;; -- Local access for maintenance (different privileges) CREATE USER \u0026#39;app_sales\u0026#39;@\u0026#39;localhost\u0026#39; IDENTIFIED BY \u0026#39;MaintPassword#2026\u0026#39;; GRANT SELECT ON sales_db.* TO \u0026#39;app_sales\u0026#39;@\u0026#39;localhost\u0026#39;; Three users. Same name. Calibrated privileges.\nThe local user has only SELECT because it\u0026rsquo;s for checks, not for writing data. Different password because the usage context is different.\nPrinciple of least privilege. Applied at the right point.\nThe Matching Trap: Who Wins? #This is where most errors originate.\nIf both 'mario'@'%' and 'mario'@'localhost' exist, and Mario connects from localhost, which user is used?\nAnswer: 'mario'@'localhost'.\nMySQL sorts the rows in the mysql.user table from most specific to least specific:\nExact literal host (192.168.1.20) Pattern with wildcard (192.168.1.%) Full wildcard (%) And uses the first match in specificity order.\nThe classic problem is this: you create 'mario'@'%' with all privileges. Then someone creates 'mario'@'localhost' without privileges (or with a different password). From that moment, Mario can no longer log in locally and nobody understands why.\nI have seen this scenario at least a dozen times in production. 
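When a session behaves unexpectedly, MySQL can tell you directly which row won the matching: USER() returns the identity the client claimed, while CURRENT_USER() returns the user@host account the server actually authenticated against.

```sql
SELECT USER(), CURRENT_USER();
-- USER():         e.g. mario@localhost   (what the client sent)
-- CURRENT_USER(): e.g. mario@%           (the account row that matched)
```

If the two differ, the specificity ordering picked a different account than the one you thought you were using — and its privileges, not yours, apply.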
The solution is always the same: check what exists before you create.\nSELECT user, host, authentication_string FROM mysql.user WHERE user = \u0026#39;mario\u0026#39;; If you don\u0026rsquo;t do it before, you\u0026rsquo;ll do it after. With more urgency and less calm.\nMySQL vs MariaDB: The Differences That Matter #The user@host model is identical between MySQL and MariaDB. But there are implementation differences worth knowing.\nDefault authentication:\nVersion Default Plugin MySQL 5.7 mysql_native_password MySQL 8.0+ caching_sha2_password MariaDB 10.x mysql_native_password If you migrate from MariaDB to MySQL 8 (or vice versa), clients might fail to connect because the authentication plugin is different. It\u0026rsquo;s not a bug. It\u0026rsquo;s a default change.\nUser creation:\nIn MySQL 8, GRANT no longer creates users implicitly. You must do CREATE USER first and GRANT after. Always.\n-- MySQL 8: correct CREATE USER \u0026#39;app\u0026#39;@\u0026#39;10.0.0.5\u0026#39; IDENTIFIED BY \u0026#39;pwd123\u0026#39;; GRANT SELECT ON mydb.* TO \u0026#39;app\u0026#39;@\u0026#39;10.0.0.5\u0026#39;; -- MySQL 5.7 / MariaDB: still works (but deprecated) GRANT SELECT ON mydb.* TO \u0026#39;app\u0026#39;@\u0026#39;10.0.0.5\u0026#39; IDENTIFIED BY \u0026#39;pwd123\u0026#39;; If you are writing provisioning scripts, this detail can break an entire CI/CD pipeline.\nRoles:\nMySQL 8.0 introduced roles. 
MariaDB supports them since 10.0.5, but with slightly different syntax.\n-- MySQL 8.0 CREATE ROLE \u0026#39;role_readonly\u0026#39;; GRANT SELECT ON sales_db.* TO \u0026#39;role_readonly\u0026#39;; GRANT \u0026#39;role_readonly\u0026#39; TO \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39;; SET DEFAULT ROLE \u0026#39;role_readonly\u0026#39; FOR \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39;; -- MariaDB 10.x CREATE ROLE role_readonly; GRANT SELECT ON sales_db.* TO role_readonly; GRANT role_readonly TO \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39;; SET DEFAULT ROLE role_readonly FOR \u0026#39;app_sales\u0026#39;@\u0026#39;192.168.1.20\u0026#39;; The difference looks cosmetic (quotes or not), but in automated scripts it can generate syntax errors.\nThe Anonymous User: The Ghost Nobody Invited #MySQL ships with an anonymous user: ''@'localhost'. No name, no password.\nThis user is a historical artifact from development installations. In production it is a pure security risk.\nThe anonymous user wins over 'mario'@'%' when the connection comes from localhost, because 'localhost' is more specific than '%'.\nResult: Mario connects locally, MySQL authenticates him as the anonymous user, and Mario\u0026rsquo;s privileges vanish.\nThe first thing to do on any MySQL/MariaDB production installation:\nSELECT user, host FROM mysql.user WHERE user = \u0026#39;\u0026#39;; -- If found: DROP USER \u0026#39;\u0026#39;@\u0026#39;localhost\u0026#39;; DROP USER \u0026#39;\u0026#39;@\u0026#39;%\u0026#39;; -- if it exists FLUSH PRIVILEGES; It\u0026rsquo;s not paranoia. 
It\u0026rsquo;s hygiene.\nOperational Checklist #After the client experience, I formalized a checklist that I use every time I need to create users on MySQL or MariaDB:\nCheck existing users with the same name on different hosts Remove anonymous users if present Create users with specific hosts, never with '%' in production unless strictly necessary Grant only the necessary privileges — SELECT if SELECT is enough Use separate CREATE USER + GRANT (mandatory on MySQL 8) Check the authentication plugin if clients have connection issues Document the user/host pairs — in six months nobody will remember why three \u0026ldquo;app_sales\u0026rdquo; exist Conclusion #In MySQL and MariaDB a user is not a name. It is a name bound to a point of origin.\nThis model is powerful because it allows you to segment access without additional infrastructure. But it is also a source of subtle errors if you don\u0026rsquo;t understand it thoroughly.\nThe next time someone asks you to \u0026ldquo;create a user on MySQL\u0026rdquo;, before writing the first CREATE USER, ask yourself: where will they connect from?\nThe answer to that question changes everything.\nGlossary #GRANT — SQL command to assign privileges to a user or role. In MySQL 8 it no longer creates users implicitly: CREATE USER first, then GRANT.\nLeast Privilege — Security principle that prescribes assigning only the strictly necessary permissions. In MySQL it\u0026rsquo;s applied by calibrating privileges per user/host pair.\nAuthentication Plugin — Module handling credential verification. The default changes between MySQL 5.7 (mysql_native_password), MySQL 8 (caching_sha2_password) and MariaDB.\nAnonymous User — MySQL user with no name (''@'localhost') automatically created during installation. Can interfere with legitimate user matching and should be removed in production.\nFLUSH PRIVILEGES — Command that reloads grant tables into memory, making manual privilege changes effective. 
Needed after direct operations on the mysql.user table.\n","date":"13 January 2026","permalink":"https://ivanluminaria.com/en/posts/mysql/mysql-users-and-hosts/","section":"Database Strategy","summary":"\u003cp\u003eA few weeks ago a client calls me. Pragmatic tone, seemingly trivial request:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;I need to create a user on MySQL for an application that needs to access a database. Can you take care of it?\u0026rdquo;\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eSure. \u003ccode\u003eCREATE USER\u003c/code\u003e, \u003cspan class=\"glossary-tip\" tabindex=\"0\" data-glossary-desc=\"SQL command to assign specific privileges to a user or role on databases, tables or columns. In MySQL 8 it no longer creates users implicitly.\" data-glossary-url=\"/en/glossary/grant/\" data-glossary-more=\"Read more →\"\u003e`GRANT`\u003c/span\u003e\n, next.\u003c/p\u003e","title":"MySQL Users: Why 'mario' and 'mario'@'localhost' Are Not the Same Person"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/team-leadership/","section":"Tags","summary":"","title":"Team-Leadership"},{"content":"It was a Thursday afternoon, one of those meetings that was supposed to last an hour on paper. Seven of us, connected on a call. The agenda was straightforward: decide the migration strategy for an Oracle database from on-premise to cloud.\nStraightforward, sure. On paper.\nTwenty minutes in, the meeting had turned into a duel.\n🔥 The spark #On one side was the infrastructure manager. Experienced, twenty years of datacenters behind him. His position was rock-solid: lift-and-shift migration, zero changes to the architecture, we move everything as-is.\nOn the other, the lead developer. Young, sharp, with clear ideas. He wanted to rewrite the application layer, adopt cloud-native services, containerize everything. Let\u0026rsquo;s rebuild from scratch, the code is old anyway.\nTwo legitimate positions. 
Two real perspectives. Two smart people.\nBut the conversation had taken a familiar — and dangerous — turn.\n\u0026ldquo;No, it makes no sense to move everything to cloud without rethinking the architecture.\u0026rdquo;\n\u0026ldquo;No, rewriting everything is a huge risk and we don\u0026rsquo;t have the budget.\u0026rdquo;\nNo. No. No.\nEvery sentence started with \u0026ldquo;no\u0026rdquo;. Every answer was a negation of the previous one. Arms crossed, tone rising, sentences getting shorter. I know that pattern. I\u0026rsquo;ve seen it hundreds of times. And I know how it ends: it doesn\u0026rsquo;t end. The meeting closes without a decision, gets rescheduled to next week, and in the meantime nobody does anything because \u0026ldquo;we haven\u0026rsquo;t decided yet\u0026rdquo;.\nThe project stalls. Not for technical reasons. For pride.\n🎭 Three words that change everything #At that point I did something very simple. I waited for a pause — because in heated discussions there\u0026rsquo;s always a moment when everyone catches their breath — and I said:\n\u0026ldquo;Marco, you\u0026rsquo;re right: moving everything to cloud without changing anything is the fastest way to get to production. And we could also identify two or three components that, during the migration, it makes sense to rethink as cloud-native. Luca, which ones would you pick?\u0026rdquo;\nNobody said \u0026ldquo;no\u0026rdquo;. Nobody was contradicted.\nMarco saw his position validated — his conservative approach was the starting point. Luca was given a concrete role — choosing what to modernize, with a clear mandate.\nIn thirty seconds, two people who were fighting found themselves collaborating on the same whiteboard.\nThe meeting ended early. With a decision. A real one.\n🧠 What the \u0026ldquo;Yes-And\u0026rdquo; technique is #What I did has a name. It\u0026rsquo;s called \u0026ldquo;Yes-And\u0026rdquo;. 
It comes from improvisational theatre, where there\u0026rsquo;s a fundamental rule: never deny your scene partner\u0026rsquo;s proposal.\nIf someone says \u0026ldquo;We\u0026rsquo;re on a boat in the middle of the ocean\u0026rdquo;, you don\u0026rsquo;t respond \u0026ldquo;No, we\u0026rsquo;re in an office\u0026rdquo;. You respond \u0026ldquo;Yes, and it looks like a storm is coming\u0026rdquo;. You build. You add. You move forward.\nIn project management it works the same way.\nWhen someone proposes something and you respond \u0026ldquo;No, but\u0026hellip;\u0026rdquo;, here\u0026rsquo;s what happens psychologically:\nthe other person goes on the defensive they stop listening to whatever comes after \u0026ldquo;but\u0026rdquo; they focus on how to counter-argue, not how to solve the conversation becomes a ping-pong of negations When you respond \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo;, the opposite happens:\nthe other person feels acknowledged defences come down they become open to hearing your addition the conversation becomes constructive It\u0026rsquo;s not manipulation. It\u0026rsquo;s not empty diplomacy. It\u0026rsquo;s a precise technique for moving decisions forward without burning relationships.\n🛠️ How it works in daily practice #In thirty years of projects, I\u0026rsquo;ve applied \u0026ldquo;Yes-And\u0026rdquo; in dozens of situations. It works wherever there\u0026rsquo;s a decision to make and multiple people with different opinions.\nIn project meetings #Instead of: \u0026ldquo;No, the three-month timeline is unrealistic.\u0026rdquo;\nTry: \u0026ldquo;Yes, three months is the target. And to get there we\u0026rsquo;d need to cut the first release scope to these three features — the rest goes into phase two.\u0026rdquo;\nSee the difference? In the first version you have a wall. 
In the second you have a plan.\nIn code reviews #Instead of: \u0026ldquo;No, this approach is wrong, you wrote it in an overly complicated way.\u0026rdquo;\nTry: \u0026ldquo;Yes, it works. And we could simplify it by extracting this logic into a separate method — it becomes more testable.\u0026rdquo;\nThe developer doesn\u0026rsquo;t feel attacked. They feel helped. And next time they come to you for input before writing the code, not after.\nIn stakeholder negotiations #Instead of: \u0026ldquo;No, we can\u0026rsquo;t add that feature now, we\u0026rsquo;re already behind schedule.\u0026rdquo;\nTry: \u0026ldquo;Yes, that feature makes sense. And to include it without compromising the release date, we\u0026rsquo;d need to swap it with this other one that\u0026rsquo;s lower priority. Which of the two do you prefer?\u0026rdquo;\nThe stakeholder doesn\u0026rsquo;t hear a \u0026ldquo;no\u0026rdquo;. They hear a \u0026ldquo;yes, and now let\u0026rsquo;s decide together how to do it\u0026rdquo;.\n⚠️ When \u0026ldquo;Yes-And\u0026rdquo; doesn\u0026rsquo;t work #It would be nice to say it always works. It doesn\u0026rsquo;t. There are situations where \u0026ldquo;Yes-And\u0026rdquo; is the wrong tool.\nSecurity issues. If someone proposes removing authentication from the production database because \u0026ldquo;it slows down queries\u0026rdquo;, the answer is not \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo;. The answer is \u0026ldquo;No. Full stop.\u0026rdquo;\nProcess violations. If a developer wants to deploy to production on Friday evening without tests, there\u0026rsquo;s no \u0026ldquo;Yes-And\u0026rdquo; for that. There\u0026rsquo;s a process, and it must be followed.\nNon-negotiable deadlines. When go-live is Monday and it\u0026rsquo;s Thursday, it\u0026rsquo;s not the time to build on everyone\u0026rsquo;s ideas. It\u0026rsquo;s the time to decide, execute and close.\nToxic behaviour. 
\u0026ldquo;Yes-And\u0026rdquo; works with people acting in good faith who have different opinions. It doesn\u0026rsquo;t work with people who just want to be right, who sabotage, who refuse to listen on principle. In those cases you need a different kind of conversation — private, direct and very frank.\nThe technique is not a magic formula. It\u0026rsquo;s a tool. And like all tools, you need to know when to use it and when to put it down.\n📊 The hidden cost of \u0026ldquo;No, but\u0026hellip;\u0026rdquo; #I tried to do a rough calculation on a project I managed two years ago. A team of eight, meetings three times a week.\nSituation Average meeting length Decisions made Before (\u0026ldquo;No, but\u0026hellip;\u0026rdquo; culture) 1h 20min 0.5 per meeting After (\u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; culture) 45min 1.8 per meeting The team was making decisions three times faster and meetings lasted almost half as long.\nI don\u0026rsquo;t have scientific data. These are empirical numbers, collected on one specific project. But the pattern is consistent with what I\u0026rsquo;ve seen over twenty years: teams that discuss constructively move faster than those that argue. Not because they avoid conflict — because they get through it better.\n🎯 What I\u0026rsquo;ve learned #\u0026ldquo;Yes-And\u0026rdquo; is not diplomacy. It\u0026rsquo;s not avoiding confrontation. It\u0026rsquo;s not saying yes to everything.\nIt\u0026rsquo;s recognizing that most discussions in IT projects aren\u0026rsquo;t about who\u0026rsquo;s right. They\u0026rsquo;re about how to move things forward. And things move forward when people feel heard, not when they\u0026rsquo;re defeated.\nI\u0026rsquo;ve seen projects stall for weeks because two brilliant people couldn\u0026rsquo;t stop saying \u0026ldquo;no\u0026rdquo; to each other. 
And I\u0026rsquo;ve seen those same projects unblock in fifteen minutes when someone had the good sense to say \u0026ldquo;yes, and\u0026hellip;\u0026rdquo;.\nYou don\u0026rsquo;t need a communication course. You don\u0026rsquo;t need a coach. You need to try, the next time someone says something you disagree with, to respond \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; instead of \u0026ldquo;No, but\u0026hellip;\u0026rdquo;.\nA simple exercise. One that changes the way decisions are made.\nThat changes the way people work together.\nAnd that, sometimes, saves a meeting that was about to explode.\n💬 For anyone who\u0026rsquo;s been there at least once #If you\u0026rsquo;ve ever sat in a meeting where two people were talking over each other and nobody was listening to anyone. If you\u0026rsquo;ve ever seen a project stall not because of a technical problem, but because of a communication problem. If you\u0026rsquo;ve ever thought \u0026ldquo;why can\u0026rsquo;t we just decide?\u0026rdquo;\nTry \u0026ldquo;Yes-And\u0026rdquo;. Next meeting. Just once.\nIt costs nothing. It needs no approval. 
It doesn\u0026rsquo;t require a budget.\nIt just requires the ability to hold back for one second before saying \u0026ldquo;no\u0026rdquo; — and replace it with \u0026ldquo;yes, and\u0026hellip;\u0026rdquo;.\nThe result might surprise you.\nGlossary #Yes-And — Communication technique from improvisational theatre that replaces \u0026ldquo;No, but\u0026hellip;\u0026rdquo; with \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo;, turning discussions into collaborative building.\nStakeholder — Person or group with a direct interest in a project\u0026rsquo;s outcome: client, end user, sponsor, technical team, or any party affected by project decisions.\nScope — Project perimeter that defines what is included and excluded: features, deliverables, constraints, and boundaries agreed with stakeholders.\nLift-and-Shift — Migration strategy that moves a system from one environment to another without modifying its architecture, code, or configuration.\nTimeboxing — Time management technique that assigns a fixed, non-negotiable interval to an activity, forcing conclusion within the established limit.\n","date":"13 January 2026","permalink":"https://ivanluminaria.com/en/posts/project-management/tecnica-si-e-yes-and/","section":"Database Strategy","summary":"\u003cp\u003eIt was a Thursday afternoon, one of those meetings that was supposed to last an hour on paper. Seven of us, connected on a call. The agenda was straightforward: decide the migration strategy for an Oracle database from on-premise to cloud.\u003c/p\u003e\n\u003cp\u003eStraightforward, sure. 
On paper.\u003c/p\u003e\n\u003cp\u003eTwenty minutes in, the meeting had turned into a duel.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"-the-spark\"\u003e🔥 The spark\u003c/h2\u003e\u003cp\u003eOn one side was the infrastructure manager. Experienced, twenty years of datacenters behind him. His position was rock-solid: \u003cstrong\u003elift-and-shift migration, zero changes to the architecture, we move everything as-is\u003c/strong\u003e.\u003c/p\u003e","title":"The Yes-And technique: how I defused a meeting that was about to blow up"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/users/","section":"Tags","summary":"","title":"Users"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/indexes/","section":"Tags","summary":"","title":"Indexes"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/pg_trgm/","section":"Tags","summary":"","title":"Pg_trgm"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/query-tuning/","section":"Tags","summary":"","title":"Query-Tuning"},{"content":"A few weeks ago, a client contacted me with a very common issue:\n\u0026ldquo;Search in the admin console is slow. Sometimes it takes several seconds.
We\u0026rsquo;ve already reduced the JOINs, but the problem hasn\u0026rsquo;t disappeared.\u0026rdquo;\nEnvironment: PostgreSQL on managed cloud.\nMain table: payment_report (~6 million rows, 3 GB).\nSearched column: reference_code.\nProblematic query:\nSELECT *\nFROM reporting.payment_report r\nJOIN reporting.payment_cart c ON c.id = r.cart_id\nWHERE c.service_id = 1001\n  AND r.reference_code LIKE \u0026#39;%ABC123%\u0026#39;\nORDER BY c.created_at DESC\nLIMIT 100;\n🧠 First observation: the JOINs were not the problem #I compared:\n- AS-IS version (3 JOINs on the same table)\n- TO-BE version (only 1 JOIN)\nThe result?\nThe execution plan showed in both cases:\n- Parallel Seq Scan on payment_report\n- Rows Removed by Filter: ~2,000,000\n- Buffers: shared read = hundreds of thousands\n- Execution Time: 14–18 seconds\nReducing the JOINs had only a marginal impact.\nThe real problem was something else.\n📌 The culprit: LIKE '%value%' without a proper index #A search with a leading wildcard (%value%) makes a normal B-Tree index unusable.\nPostgreSQL is forced to perform a sequential scan of the entire table.\nIn this specific case:\n- ~3 GB of data\n- hundreds of thousands of 8KB pages read\n- I/O bound workload\n- seconds of latency\nThis is not a matter of \u0026ldquo;bad SQL\u0026rdquo;. It is an access path problem.\n🔬 Before creating an index: risk analysis #The client rightly asked:\n\u0026ldquo;If we create a trigram (GIN) index, do we risk slowing down payment transactions?\u0026rdquo;\nThis is where a frequently ignored concept comes into play: churn.\nWhat is churn? #It represents how much a table changes after rows are inserted.\nHigh frequency of:\n- UPDATE\n- DELETE\n→ high churn\n→ higher index maintenance cost\n→ possible write degradation\nIn our case:\nTable payment_report:\n- ~12k inserts/day\n- 0 updates\n- 0 deletes\n- 0 dead tuples\nProfile: append-only\nThis is the best possible scenario to introduce a GIN index.\n📊 Critical check: synchronous or batch?
#The table did not contain an insertion timestamp.\nSolution: indirect analysis.\nI correlated rows in payment_report with the cart timestamp (payment_cart.created_at) and analyzed hourly distribution.\nResult:\n- continuous 24/7 pattern\n- daytime peaks\n- nighttime drop\n- perfect correlation with cart traffic\nConclusion: near real-time population, not nightly batch.\n🛠️ The solution #CREATE EXTENSION IF NOT EXISTS pg_trgm;\nCREATE INDEX CONCURRENTLY idx_payment_report_reference_trgm\n    ON reporting.payment_report\n    USING gin (reference_code gin_trgm_ops);\nPrecautions:\n- Create during an off-peak window\n- Use CONCURRENTLY mode\n- Monitor I/O during index build\n📈 Result: the execution plan before and after #Here is the full execution plan for the query — before and after creating the trigram index.\nBefore (without trigram index):\nNested Loop Inner Join\n  → Nested Loop Inner Join\n    → Nested Loop Inner Join\n      → Seq Scan on payment_report as r\n          Filter: ((reference_code)::text ~~ \u0026#39;%ABC123%\u0026#39;::text)\n      → Index Scan using payment_cart_pkey on payment_cart as c\n          Filter: (service_id = 1001)\n          Index Cond: (id = r.cart_id)\n    → Index Only Scan using payment_cart_pkey on payment_cart as c2\n        Index Cond: (id = c.id)\n  → Index Only Scan using payment_cart_pkey on payment_cart as c3\n      Index Cond: (id = c.id)\nAfter (with trigram index):\nNested Loop Inner Join\n  → Nested Loop Inner Join\n    → Nested Loop Inner Join\n      → Bitmap Heap Scan on payment_report as r\n          Recheck Cond: ((reference_code)::text ~~ \u0026#39;%ABC123%\u0026#39;::text)\n          → Bitmap Index Scan using idx_payment_report_reference_trgm\n              Index Cond: ((reference_code)::text ~~ \u0026#39;%ABC123%\u0026#39;::text)\n      → Index Scan using payment_cart_pkey on payment_cart as c\n          Filter: (service_id = 1001)\n          Index Cond: (id = r.cart_id)\n    → Index Only Scan using payment_cart_pkey on payment_cart as c2\n        Index Cond: (id = c.id)\n  → Index Only Scan using payment_cart_pkey on payment_cart as c3\n      Index Cond: (id = c.id)\nThe key change is at steps 4–5: the Seq Scan — which read
the entire table row by row — has been replaced by a Bitmap Heap Scan driven by the trigram index idx_payment_report_reference_trgm. PostgreSQL now filters directly through the index and only rechecks the candidate rows.\nSame query, same data, but a completely different access path. From seconds to milliseconds.\n🎯 Key lesson #When a query is slow:\nDon\u0026rsquo;t stop at the number of JOINs. Look at the execution plan. Identify whether the bottleneck is CPU or I/O. Evaluate churn before introducing a GIN index. Always measure before deciding. Often the problem is not \u0026ldquo;optimizing the query\u0026rdquo;.\nIt is giving the planner the right index.\n💬 Why share this case? #Because this is an extremely common scenario:\nLarge tables \u0026ldquo;Contains\u0026rdquo; search patterns Fear of introducing GIN indexes Concern about write performance degradation With data in hand, the decision becomes technical, not emotional.\nOptimization is not magic.\nIt is measurement, plan analysis, and understanding real system behavior.\nGlossary #GIN Index — Generalized Inverted Index: PostgreSQL index type that creates an inverted mapping from each element to the records containing it. Ideal for \u0026ldquo;contains\u0026rdquo; searches on text with pg_trgm.\nB-Tree — Balanced tree data structure, the default index in relational databases. Efficient for equality and range searches, but unusable for LIKE '%value%'.\npg_trgm — PostgreSQL extension that decomposes text into trigrams (3-character sequences), enabling GIN indexes to accelerate wildcard searches.\nChurn — Measure of how much a table changes after insertion. Low churn (append-only) is the best scenario for introducing a GIN index without degrading writes.\nExecution Plan — Sequence of operations chosen by the database to resolve a query. 
Reading the plan is the first step to identify whether the problem is CPU, I/O or a wrong access path.\n","date":"6 January 2026","permalink":"https://ivanluminaria.com/en/posts/postgresql/like-optimization-postgresql/","section":"Database Strategy","summary":"\u003cp\u003eA few weeks ago, a client contacted me with a very common issue:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;Search in the admin console is slow. Sometimes it takes several\nseconds. We\u0026rsquo;ve already reduced the JOINs, but the problem hasn\u0026rsquo;t\ndisappeared.\u0026rdquo;\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eEnvironment: PostgreSQL on managed cloud.\u003cbr\u003e\nMain table: \u003ccode\u003epayment_report\u003c/code\u003e (~6 million rows, 3 GB).\u003cbr\u003e\nSearched column: \u003ccode\u003ereference_code\u003c/code\u003e.\u003c/p\u003e\n\u003cp\u003eProblematic query:\u003c/p\u003e\n\u003cpre\u003e\u003ccode class=\"language-sql\"\u003eSELECT *\nFROM reporting.payment_report r\nJOIN reporting.payment_cart c ON c.id = r.cart_id\nWHERE c.service_id = 1001\n  AND r.reference_code LIKE \u0026#39;%ABC123%\u0026#39;\nORDER BY c.created_at DESC\nLIMIT 100;\n\u003c/code\u003e\u003c/pre\u003e\u003chr\u003e\n\u003ch2 id=\"-first-observation-the-joins-were-not-the-problem\"\u003e🧠 First observation: the JOINs were not the problem\u003c/h2\u003e\u003cp\u003eI compared:\u003c/p\u003e","title":"When a LIKE '%value%' Slows Everything Down: A Real PostgreSQL Optimization Case"},{"content":"The story I\u0026rsquo;m about to tell is true. I won\u0026rsquo;t name names — not out of diplomacy, but because names don\u0026rsquo;t matter. What matters is understanding the mechanism.
Because this mechanism repeats itself, identically, in dozens of companies. And it costs millions.\n🏢 The client: an insurance group with a legitimate ambition #A solid company in the insurance sector. Operations in Italy, France, Northern European countries, Spain. Thousands of employees, millions of policies under management, a growing business.\nAt some point, the board makes a reasonable decision: we need a custom management system. A system that reflects our processes, our business rules, the regulatory specificities of every country where we operate.\nA legitimate decision. Sensible. Even strategic.\nThe problem isn\u0026rsquo;t the decision.\nThe problem is who they entrust it to.\n💰 Act one: the big multinational (2013–2018) #One of the Big Names in global IT consulting is brought in — full outsourcing . A name everyone knows. Thousands of consultants, offices on every continent, PowerPoint presentations that could bring you to tears.\nThe project kicks off. Requirements are defined. The budget is estimated. Contracts are signed.\nMonths pass. Then years.\nDeliverables arrive — on paper. But the software doesn\u0026rsquo;t work as expected. Specifications change. Costs balloon. Consultants rotate: the one who understood the domain leaves, a new one arrives and starts from scratch. The classic pattern of fixed-scope consulting that becomes, in practice, open-ended engagement.\nFrom 2013 to 2018: over 2.5 million euros spent.\nResult: incomplete, unstable software that no one internally knew how to maintain.\nBecause they had written the code. With their conventions. With their architecture. And when they left, they took the knowledge with them.\n🔄 Act two: let\u0026rsquo;s change supplier (2018–2022) #Management, burned but not defeated, decides to switch. \u0026ldquo;The problem was the supplier,\u0026rdquo; they think. \u0026ldquo;Let\u0026rsquo;s get a better one.\u0026rdquo;\nEnter another multinational. Equally famous. Equally large. 
Equally expensive.\nNew kickoff. New requirements analysis — because obviously they can\u0026rsquo;t build on the previous supplier\u0026rsquo;s work. New slides. New promises.\nAnd history repeats itself.\nSame problems, different actors. Consultant turnover. Loss of know-how. Timelines stretching. Budgets exploding. Endless meetings discussing milestones that never arrive.\nFrom 2018 to 2022: another 1.5 million euros.\nResult: another piece of software that fails to meet business needs.\nTotal invested over nearly a decade: over 4 million euros.\nWorking software: zero.\n📊 Let\u0026rsquo;s tally up the disaster #\n| Period | Supplier | Investment | Result |\n|---|---|---|---|\n| 2013 – 2018 | Multinational A | ~€2,500,000 | Incomplete software, abandoned |\n| 2018 – 2022 | Multinational B | ~€1,500,000 | Inadequate software, abandoned |\n| Total | | ~€4,000,000+ | No software in production |\nFour million euros. Nearly ten years of project work. Two of the most prestigious names in global IT consulting.\nAnd in the end, the company finds itself exactly where it started.\nIt\u0026rsquo;s not bad luck. It\u0026rsquo;s a pattern.\nAnd anyone who\u0026rsquo;s worked in this industry for thirty years, like me, recognises it at first glance.\n🧠 Why it happens: the anatomy of failure #This kind of failure isn\u0026rsquo;t an accident. It\u0026rsquo;s the predictable outcome of a business model with a structural flaw.\n1. The incentive is wrong.\nA large consulting firm makes money by selling man-days. The longer the project lasts, the more it bills. There\u0026rsquo;s no real incentive to finish the project quickly and well. There\u0026rsquo;s an incentive to keep it alive as long as possible.\n2. Turnover is endemic.\nMajor consulting multinationals have annual turnover rates of 15–25%. In a project lasting five years, the team is completely renewed at least twice. Each time you start over: new learning curve, new interpretation of requirements, new mistakes.\n3.
Know-how walks out the door (vendor lock-in).\nWhen the supplier finishes (or gets fired), the system knowledge leaves with them. The client is left with software they don\u0026rsquo;t understand, can\u0026rsquo;t maintain, and can\u0026rsquo;t evolve.\n4. Specifications become a weapon (scope creep).\nIn a custom project of this scale, specifications are always incomplete — because the business is complex and evolving. This becomes the perfect alibi: \u0026ldquo;the software doesn\u0026rsquo;t work because the specifications changed.\u0026rdquo; And it\u0026rsquo;s always someone else\u0026rsquo;s fault.\n✅ The turning point: buy, don\u0026rsquo;t build #In the end, after nearly a decade and over 4 million burned, the company makes the decision it should have made from the start:\nBuy an existing market software product and adapt it internally to their needs.\nA commercial insurance product, battle-tested, with a stable codebase and a support community. And an internal team — people who know the business, who stay with the company, who accumulate knowledge instead of dispersing it — tasked with customising and evolving it.\nCost? A fraction of what was spent in the previous ten years.\nResult? A system that works. That evolves. That the company truly owns.\nThe lesson is brutal in its simplicity:\nNot everything needs to be built from scratch. And above all, not everything should be delegated to those who have no interest in finishing.\n🏗️ The comparison that hurts: our Data Warehouse #And here\u0026rsquo;s the part of the story I know from the inside. Because for the same company, during the same period, a colleague and I built something that works. Every single day.\nA complete Data Warehouse. Designed, developed, deployed to production, and maintained by two people.\nNot a demo. Not a prototype.
A production system that:\n- Loads data every day — the entire ETL cycle runs in one and a half hours\n- Integrates 4 different source systems — each with its own format, protocol, and quirks\n- Collects data from 4 geographic areas: Italy, France, Northern European countries, Spain\n- Comprises approximately 60,000 lines of code written by four hands\n- The architecture was designed by me — from the data model to the loading strategy, from error handling to historicisation\n| | Custom management software | Data Warehouse |\n|---|---|---|\n| Team | Two multinationals (dozens of consultants) | 2 people |\n| Project duration | ~10 years (and counting) | 3 years |\n| Budget | €4,000,000+ | A fraction |\n| Lines of code | Unknown (and abandoned) | ~60,000 (documented, maintained) |\n| Result | No software in production | System in daily production |\n| Processing time | — | 1h 30min / day |\n| Geographic coverage | — | 4 countries, 4 source systems |\n| Know-how | Lost with every supplier change | Internal, stable, documented |\nTwo people. Three years. A system that wakes up every morning, gathers data from four corners of Europe, transforms it, loads it, and makes it available for business decisions. In an hour and a half.\nSixty thousand lines of code. Each one thought through, tested, maintained by those who wrote it.\nNo PowerPoint. No kickoff. No consultant walking away with the knowledge.\nJust competence, solid architecture, and work done right.\n🎯 The lesson #When I talk to companies about to embark on a major IT project, I always say the same thing:\nDon\u0026rsquo;t pay for a brand. Pay for the people.\nA small team of professionals who know the domain, who stay on the project, who are accountable for the result — is worth more than a hundred rotating consultants billing days.\nSoftware isn\u0026rsquo;t built with slides.
It\u0026rsquo;s built with hands in the code, architecture in the mind, and responsibility on the shoulders.\nFour million euros up in smoke teach one thing:\nThe highest cost isn\u0026rsquo;t the wrong supplier you pick.\nIt\u0026rsquo;s the time you waste before realising the solution was simpler than what they sold you.\n💬 To those about to sign that contract #If your company is about to entrust a critical project to a large consulting firm, pause for a moment.\nAsk yourself:\nWho will write the code? Will they still be with the company in two years? If the supplier leaves tomorrow, could we maintain the system? Is there a market product that covers 80% of our needs? Can we build a small, competent, stable internal team? The answers to these questions are worth more than any commercial proposal.\nBecause the difference between a project that works and one that burns millions isn\u0026rsquo;t about technology.\nIt\u0026rsquo;s about the people. The continuity. The accountability.\nAnd the ability to say \u0026ldquo;no\u0026rdquo; to those who sell you complexity when the solution is simple.\nGlossary #Data Warehouse — Centralised data collection and historicisation system from diverse sources, designed for analysis and business decision support. In the case described, built by two people with 60,000 lines of code.\nETL — Extract, Transform, Load: the process of extracting data from source systems, transforming it and loading it into the data warehouse. The described DWH\u0026rsquo;s ETL cycle runs in one and a half hours.\nVendor Lock-in — Structural dependency on an external supplier that makes switching providers difficult. It establishes itself when know-how and code remain in the supplier\u0026rsquo;s hands.\nScope Creep — Uncontrolled expansion of project requirements beyond the initial scope. Incomplete specifications become the alibi for delays and additional costs.\nOutsourcing — Externalisation of IT activities to external suppliers. 
Risky for long-term strategic projects, where consultant turnover and know-how loss can burn millions.\n","date":"30 December 2025","permalink":"https://ivanluminaria.com/en/posts/project-management/4-milioni-nessun-software/","section":"Database Strategy","summary":"\u003cp\u003eThe story I\u0026rsquo;m about to tell is true. I won\u0026rsquo;t name names — not out of diplomacy, but because names don\u0026rsquo;t matter. What matters is understanding the mechanism. Because this mechanism repeats itself, identically, in dozens of companies. And it costs millions.\u003c/p\u003e\n\u003chr\u003e\n\u003ch2 id=\"-the-client-an-insurance-group-with-a-legitimate-ambition\"\u003e🏢 The client: an insurance group with a legitimate ambition\u003c/h2\u003e\u003cp\u003eA solid company in the insurance sector. Operations in Italy, France, Northern European countries, Spain.
Thousands of employees, millions of policies under management, a growing business.\u003c/p\u003e","title":"4 million euros, two multinationals, zero software: the true story of a failure foretold"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/data-warehouse/","section":"Tags","summary":"","title":"Data-Warehouse"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/insurance/","section":"Tags","summary":"","title":"Insurance"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/outsourcing/","section":"Tags","summary":"","title":"Outsourcing"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/execution-plan/","section":"Tags","summary":"","title":"Execution-Plan"},{"content":"Two billion rows. You do not reach that number in a day. It takes years of transactions, movements, daily records piling up. And for all that time the database works, queries respond, reports come out. Then one day someone opens a ticket: \u0026ldquo;the monthly report takes four hours.\u0026rdquo;\nFour hours. For a report that six months earlier took twenty minutes.\nIt is not a bug. It is not a network issue or slow storage. It is the physics of data: when a table grows beyond a certain threshold, the approaches that worked stop working. And if you did not design the structure to handle that growth, the database does the only thing it can: read everything.\nThe context: telecoms and industrial volumes #The client was a telecom operator. Nothing exotic — a classic Oracle 19c Enterprise Edition environment on Linux, SAN storage, about thirty instances across production, staging and development. The critical instance was billing: invoicing, CDR (Call Detail Records), accounting movements.\nThe table at the centre of the problem was called TXN_MOVIMENTI. It collected every single transaction from the billing system since 2016. 
The structure was roughly this:\nCREATE TABLE txn_movimenti (\n    txn_id          NUMBER(18)    NOT NULL,\n    data_movimento  DATE          NOT NULL,\n    cod_cliente     VARCHAR2(20)  NOT NULL,\n    tipo_movimento  VARCHAR2(10)  NOT NULL,\n    importo         NUMBER(15,4),\n    canale          VARCHAR2(30),\n    stato           VARCHAR2(5)   DEFAULT \u0026#39;ATT\u0026#39;,\n    data_insert     TIMESTAMP     DEFAULT SYSTIMESTAMP,\n    CONSTRAINT pk_txn_movimenti PRIMARY KEY (txn_id)\n);\n2.1 billion rows. 380 GB of data. A single segment, a single tablespace, no partitions. A monolith.\nThe indexes were there: one on the primary key, one on data_movimento, one composite on (cod_cliente, data_movimento). But when a table exceeds a certain size, even an index range scan is no longer enough, because the volume of data returned is still enormous.\nThe symptoms: it is not slowness, it is collapse #The problems did not show up all at once. They arrived gradually, as always happens with tables that grow without control.\nFirst signal: monthly reports. The aggregate billing query — summing amounts by customer for a given month — had gone from 20 minutes to 4 hours over the course of a year. The execution plan showed an index range scan on the date, but the number of blocks read was monstrous: Oracle had to traverse hundreds of thousands of index leaf blocks and then do table access by rowid to retrieve the columns not covered by the index.\nSecond signal: maintenance. ALTER INDEX REBUILD on the date index took six hours. Statistics collection (DBMS_STATS.GATHER_TABLE_STATS) would not finish overnight. RMAN backups had become a gamble: sometimes they fit in the window, sometimes not.\nThird signal: involuntary full table scans. Queries with date predicates that the optimizer chose to resolve with a full table scan because the estimated cost of the index scan was higher.
On 380 GB of data.\nThe execution plan for the billing query looked like this:\nSELECT cod_cliente,\n       TRUNC(data_movimento, \u0026#39;MM\u0026#39;) AS mese,\n       SUM(importo) AS totale\nFROM txn_movimenti\nWHERE data_movimento BETWEEN DATE \u0026#39;2025-01-01\u0026#39; AND DATE \u0026#39;2025-01-31\u0026#39;\n  AND stato = \u0026#39;CON\u0026#39;\nGROUP BY cod_cliente, TRUNC(data_movimento, \u0026#39;MM\u0026#39;);\n---------------------------------------------------------------------\n| Id | Operation                    | Name          | Rows | Cost |\n---------------------------------------------------------------------\n|  0 | SELECT STATEMENT             |               | 125K | 890K |\n|  1 |  HASH GROUP BY               |               | 125K | 890K |\n|  2 |   TABLE ACCESS BY INDEX ROWID| TXN_MOVIMENTI |  28M | 885K |\n|* 3 |    INDEX RANGE SCAN          | IDX_TXN_DATA  |  28M |  85K |\n---------------------------------------------------------------------\n28 million rows for January alone. The index found the rows, but then Oracle had to fetch each individual row from the table to read cod_cliente, importo and stato. Millions of random I/O operations on a 380 GB table scattered across thousands of blocks.\nThe solution: you do not need a better index, you need a different structure #I spent two days analysing access patterns before proposing any solution. Because partitioning is not a magic wand — if you get the partition key wrong, you make things worse.\nThe patterns were clear:\n- 90% of queries had a predicate on the date (data_movimento)\n- Reports were always monthly or quarterly\n- Operational queries (single customer) always used cod_cliente + data_movimento\n- Data older than 3 years was never read by reports, only by annual archival batches\nThe choice fell on monthly interval partitioning on the data_movimento column. Not classic range partitioning, where you have to manually create each future partition.
Interval: you define the interval once and Oracle creates partitions automatically when data arrives for a new period.

The implementation: CTAS, local indexes and zero downtime (almost) #
You cannot casually repartition an existing table with 2 billion rows in place. Oracle 12.2 and later can do it with ALTER TABLE ... MODIFY PARTITION BY ... ONLINE, and there is always Online Table Redefinition — but both options, on a table this size, carry their own risks.

I chose the CTAS approach — Create Table As Select — with parallelism. Create the new partitioned table, copy the data, rename.

Step 1: create the partitioned table #

CREATE TABLE txn_movimenti_part
PARTITION BY RANGE (data_movimento)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
  PARTITION p_before_2016 VALUES LESS THAN (DATE '2016-01-01'),
  PARTITION p_2016_01     VALUES LESS THAN (DATE '2016-02-01'),
  PARTITION p_2016_02     VALUES LESS THAN (DATE '2016-03-01')
  -- Oracle will automatically create subsequent partitions
)
TABLESPACE ts_billing_data
NOLOGGING
PARALLEL 8
AS
SELECT /*+ PARALLEL(t, 8) */
       txn_id, data_movimento, cod_cliente, tipo_movimento,
       importo, canale, stato, data_insert
FROM   txn_movimenti t;

`NOLOGGING` is essential: without it the copy generates redo for every block written. On 380 GB that would mean saturating the archived log destination for days. With NOLOGGING the copy took three and a half hours with parallelism at 8.

After the copy I restored logging:

ALTER TABLE txn_movimenti_part LOGGING;

And ran an RMAN backup immediately, because blocks loaded with NOLOGGING cannot be recovered from the redo stream if you ever need a restore.
Advantage: maintenance operations on one partition do not touch the others.\nA global index spans all partitions. It is more efficient for queries that do not filter on the partition key, but any DDL operation on the partition (drop, truncate, split) invalidates the entire index.\n-- Primary key as global index (needed for point lookups) ALTER TABLE txn_movimenti_part ADD CONSTRAINT pk_txn_mov_part PRIMARY KEY (txn_id) USING INDEX GLOBAL; -- Local index on date (partition-aligned) CREATE INDEX idx_txn_mov_data ON txn_movimenti_part (data_movimento) LOCAL PARALLEL 8; -- Local composite index for operational queries CREATE INDEX idx_txn_mov_cli_data ON txn_movimenti_part (cod_cliente, data_movimento) LOCAL PARALLEL 8; The primary key stays global because queries by txn_id never include the date — you need direct access. The other indexes are local because they align with usage patterns: queries by date, queries by customer+date.\nStep 3: the switch #-- Rename the original table (backup) ALTER TABLE txn_movimenti RENAME TO txn_movimenti_old; -- Rename the new table ALTER TABLE txn_movimenti_part RENAME TO txn_movimenti; -- Rebuild synonyms if any -- Recompile invalidated objects BEGIN FOR obj IN (SELECT object_name, object_type FROM dba_objects WHERE status = \u0026#39;INVALID\u0026#39; AND owner = \u0026#39;BILLING\u0026#39;) LOOP BEGIN IF obj.object_type = \u0026#39;PACKAGE BODY\u0026#39; THEN EXECUTE IMMEDIATE \u0026#39;ALTER PACKAGE billing.\u0026#39; || obj.object_name || \u0026#39; COMPILE BODY\u0026#39;; ELSIF obj.object_type IN (\u0026#39;PROCEDURE\u0026#39;,\u0026#39;FUNCTION\u0026#39;,\u0026#39;VIEW\u0026#39;) THEN EXECUTE IMMEDIATE \u0026#39;ALTER \u0026#39; || obj.object_type || \u0026#39; billing.\u0026#39; || obj.object_name || \u0026#39; COMPILE\u0026#39;; END IF; EXCEPTION WHEN OTHERS THEN NULL; END; END LOOP; END; / The actual downtime was the time for the two ALTER TABLE RENAME statements: a few seconds. 
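After the rename, two dictionary queries are enough to confirm that the new structure is what the application now sees. This is a sketch; the views and columns are the standard Oracle dictionary, the object names follow the example above:

```sql
-- List the partitions; INTERVAL = 'YES' marks the ones Oracle
-- generated automatically from the interval clause.
SELECT partition_name, partition_position, interval
FROM   user_tab_partitions
WHERE  table_name = 'TXN_MOVIMENTI'
ORDER  BY partition_position;

-- Check the locality of the partitioned indexes.
-- The global, non-partitioned primary key index does not appear here:
-- it simply shows PARTITIONED = 'NO' in USER_INDEXES.
SELECT index_name, locality
FROM   user_part_indexes
WHERE  table_name = 'TXN_MOVIMENTI';
```

If a partition you expect is missing, or an index you intended as LOCAL shows up as GLOBAL, this is the cheapest moment to find out.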
Everything else — data copy, index creation — happened in parallel while the system was live. The one gap CTAS leaves open is the data written to the original table during the copy window: it has to be captured with a final delta load, or a short write freeze, before the rename.

Step 4: gather statistics #

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'BILLING',
    tabname          => 'TXN_MOVIMENTI',
    granularity      => 'ALL',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    degree           => 8
  );
END;
/

The granularity => 'ALL' parameter is important: it tells Oracle to gather statistics at the global, partition and subpartition level. Without it, the optimizer might make wrong decisions because it does not know the data distribution within individual partitions.

Before and after: the numbers #
The same billing query, after partitioning:

------------------------------------------------------------------------
| Id | Operation               | Name          | Rows | Cost |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT        |               | 125K |  12K |
|  1 |  HASH GROUP BY          |               | 125K |  12K |
|  2 |   PARTITION RANGE SINGLE|               |  28M |  11K |
|  3 |    TABLE ACCESS FULL    | TXN_MOVIMENTI |  28M |  11K |
------------------------------------------------------------------------

Look at step 2: PARTITION RANGE SINGLE. Oracle knows that January data sits in a single partition and reads only that one. The full table scan that used to be terrifying is now a full partition scan — on about 4 GB instead of 380.

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Monthly query time | 4 hours | 3 minutes | -98% |
| Consistent gets | 48M | 580K | -98.8% |
| Physical reads | 12M | 95K | -99.2% |
| GATHER_TABLE_STATS time | 14 hours | 25 min (per partition) | -97% |
| Index rebuild time | 6 hours | 12 min (per partition) | -97% |
| Incremental backup size | 380 GB | ~4 GB/month | -99% |

The cost went from 890K to 12K. That is not a percentage improvement — it is a different order of magnitude.

Partition pruning: the real magic #
The mechanism that makes all this possible is called **partition pruning**.
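Pruning is easy to verify on any individual statement with DBMS_XPLAN. A sketch, reusing the billing query from the example above: the Pstart/Pstop columns show exactly which partitions Oracle will touch.

```sql
EXPLAIN PLAN FOR
SELECT cod_cliente, SUM(importo)
FROM   txn_movimenti
WHERE  data_movimento BETWEEN DATE '2025-01-01' AND DATE '2025-01-31'
GROUP  BY cod_cliente;

-- The PARTITION format modifier adds the Pstart/Pstop columns:
-- a single partition number means pruning worked, KEY means the
-- partition is resolved at run time, and a 1 - N range covering all
-- partitions means no pruning at all.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => 'BASIC +PARTITION'));
```

Running this for every critical query before go-live is cheap insurance.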
It is not something you configure — Oracle does it automatically when the query predicate matches the partition key.

But you need to know when it works and when it does not.

It works with direct predicates on the partition column:

-- Pruning active: Oracle reads only the January partition
WHERE data_movimento BETWEEN DATE '2025-01-01' AND DATE '2025-01-31'
-- Pruning active: Oracle reads only the specific partition
WHERE data_movimento = DATE '2025-03-15'

It does not work when the column is wrapped in a function:

-- Pruning DISABLED: Oracle must read all partitions
WHERE TRUNC(data_movimento) = DATE '2025-01-01'
-- Pruning DISABLED: function on the column
WHERE TO_CHAR(data_movimento, 'YYYY-MM') = '2025-01'
-- Pruning DISABLED: arithmetic expression
WHERE data_movimento + 30 > SYSDATE

The fix is always the same: leave the column bare and push the logic to the other side of the predicate, typically as a half-open range.

-- Pruning restored: bare column, half-open range
WHERE data_movimento >= DATE '2025-01-01' AND data_movimento < DATE '2025-01-02'

This is the most common mistake I see after a partitioning implementation: developers apply functions to the date column without realising they are disabling pruning. And the table goes back to being read in full.

I spent half a day reviewing every application query that touched TXN_MOVIMENTI. I found eleven with TRUNC(data_movimento) in the WHERE clause. Eleven queries that would have ignored the partitioning.

Lifecycle management: drop partition #
One of the most concrete advantages of partitioning is data lifecycle management. Before partitioning, archiving old data meant a DELETE of billions of rows — an operation that generates mountains of redo and undo, locks the table for hours and risks blowing up the undo tablespace.

With partitioning:

-- Archive 2016 data to a read-only tablespace
ALTER TABLE txn_movimenti MOVE PARTITION p_2016_01 TABLESPACE ts_archive;
-- Or, if the data is no longer needed
ALTER TABLE txn_movimenti DROP PARTITION p_2016_01;

(One caveat: the global primary key defined earlier is left UNUSABLE by these operations unless you append UPDATE GLOBAL INDEXES, or UPDATE INDEXES, to the statement, or schedule a rebuild.)

A DROP PARTITION on a 4 GB partition takes less than a second. It generates no undo.
It generates no significant redo. It does not lock the other partitions. It is a DDL operation, not DML.\nI set up a monthly job that moved partitions older than 5 years to the archive tablespace and set them to read-only. The client recovered 120 GB of active space without deleting a single record.\nWhat I learned (and the mistakes to avoid) #After fifteen years of Oracle partitioning, I have a list of things I wish I had known earlier.\nThe partition key must match the access pattern. It sounds obvious, but I have seen tables partitioned by cod_cliente when 95% of queries filter by date. Partitioning only works if queries can prune.\nInterval partitioning is almost always better than static range. With classic range you have to manually create future partitions, which means a scheduled job or a DBA who remembers. With interval Oracle creates them on its own. One less problem.\nGlobal indexes are a trap. They work well for queries, but any DDL operation on the partition invalidates them. And rebuilding a global index on 2 billion rows takes hours. Use local indexes where possible and accept the trade-off.\nNOLOGGING is not optional for bulk operations. Without NOLOGGING, a 380 GB CTAS generates the same amount of redo. Your archivelog area will fill up, the database will go into wait, and the on-call DBA will get a phone call at 3 in the morning.\nTest pruning before going to production. Do not trust: verify with EXPLAIN PLAN that every critical query actually prunes. A single TRUNC() in the wrong predicate and you have a 380 GB full table scan.\nPartitioning does not replace indexes. It reduces the volume of data to examine, but inside the partition you still need the right indexes. A monthly partition of 28 million rows without an index is still a problem.\nWhen you need partitioning #Not every table needs partitioning. 
My rule of thumb:\nUnder 10 million rows: probably not Between 10 and 100 million: depends on access patterns and growth rate Over 100 million: probably yes Over a billion: you have no choice But the right time to implement it is before it becomes urgent. When the table already has 2 billion rows, the migration is a project in itself. When it has 50 million and is growing, it is an afternoon\u0026rsquo;s work.\nMy biggest mistake with partitioning? Not proposing it six months earlier, when all the signals were already there.\nGlossary #Partition Pruning — Automatic Oracle mechanism that excludes irrelevant partitions during query execution, reading only those containing data matching the predicate.\nCTAS — Create Table As Select: technique for creating a new table populated with a SELECT in a single operation. Essential for migrating billion-row tables to partitioning.\nLocal Index — Index partitioned with the same key as the table. Each partition has its own index portion, making DDL operations independent across partitions.\nNOLOGGING — Oracle mode that suppresses redo log generation during bulk operations, reducing times from days to hours. Requires immediate RMAN backup after use.\nTablespace — Logical Oracle storage unit grouping physical datafiles. In partitioning, enables moving old partitions to archive storage and managing data lifecycle.\n","date":"23 December 2025","permalink":"https://ivanluminaria.com/en/posts/oracle/oracle-partitioning/","section":"Database Strategy","summary":"\u003cp\u003eTwo billion rows. You do not reach that number in a day. It takes years of transactions, movements, daily records piling up. And for all that time the database works, queries respond, reports come out. Then one day someone opens a ticket: \u0026ldquo;the monthly report takes four hours.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eFour hours. For a report that six months earlier took twenty minutes.\u003c/p\u003e\n\u003cp\u003eIt is not a bug. 
It is not a network issue or slow storage. It is the physics of data: when a table grows beyond a certain threshold, the approaches that worked stop working. And if you did not design the structure to handle that growth, the database does the only thing it can: read everything.\u003c/p\u003e","title":"Oracle Partitioning: when 2 billion rows no longer fit in a query"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/partitioning/","section":"Tags","summary":"","title":"Partitioning"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/architecture/","section":"Tags","summary":"","title":"Architecture"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/data-guard/","section":"Tags","summary":"","title":"Data-Guard"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/disaster-recovery/","section":"Tags","summary":"","title":"Disaster-Recovery"},{"content":"The client was a mid-sized insurance company. Three hundred employees, an in-house management application running on Oracle 19c, a single physical server in the ground-floor server room. No replica. No standby. No disaster recovery plan.\nFor five years everything had worked. And when things work, nobody wants to spend money protecting against problems they\u0026rsquo;ve never seen.\nThe day everything stopped #On a Wednesday morning in November, at 8:47 AM, the primary data group\u0026rsquo;s disk suffered a hardware failure. Not a logical error, not a recoverable corruption. A physical failure. The RAID controller lost two disks simultaneously — one had been degraded for weeks without anyone noticing, the other gave out suddenly.\nThe database stopped. Policies couldn\u0026rsquo;t be issued. Claims couldn\u0026rsquo;t be processed. The call center told customers \u0026ldquo;technical problems, please call back later.\u0026rdquo;\nI got the call at 9:15. 
When I arrived on site, the sysadmin was already looking for compatible disks. He found them in the early afternoon. Between disk replacement, RAID rebuild, and database recovery from the previous night\u0026rsquo;s backup, the system was back online at 3:20 PM.\nSix and a half hours of total downtime. And the loss of all transactions from 11:00 PM the night before to 8:47 AM — roughly ten hours of data, because the backup was nightly only and archived logs weren\u0026rsquo;t being copied to another machine.\nThat evening the CEO sent an email to the entire company. The next day he called me: \u0026ldquo;What do we need to do so this never happens again?\u0026rdquo;\nThe design #The answer was simple in concept, less so in execution: they needed a second database, synchronized in real time, ready to take over if the primary failed.\nOracle Active Data Guard does exactly this. A primary database generates redo logs , and a standby receives and continuously applies them. If the primary dies, the standby becomes primary. If everything is fine, the standby can also be used in read-only mode — for reports, for backups, to offload the primary.\nI designed a two-node architecture:\nPrimary (oraprod1): the existing server, with new disks, at headquarters Standby (oraprod2): a new identical server, at the hosting provider\u0026rsquo;s data center, 12 km away The distance wasn\u0026rsquo;t random. Far enough to survive a localized event (fire, flood, prolonged power outage), close enough to allow synchronous replication without noticeable latency.\nThe configuration #Preparing the primary #The first step was verifying that the primary was in ARCHIVELOG mode with FORCE LOGGING enabled. 
Without these two prerequisites, Data Guard has nothing to replicate.\n-- Check archivelog mode SELECT log_mode FROM v$database; -- If needed, enable it SHUTDOWN IMMEDIATE; STARTUP MOUNT; ALTER DATABASE ARCHIVELOG; ALTER DATABASE OPEN; -- Force logging: prevents NOLOGGING operations ALTER DATABASE FORCE LOGGING; FORCE LOGGING is critical. Without it, any operation with a NOLOGGING clause — a CREATE TABLE AS SELECT, an ALTER INDEX REBUILD — won\u0026rsquo;t generate redo and creates gaps in replication. I\u0026rsquo;ve seen it happen three times in my career. After the third time, I decided FORCE LOGGING is always on, no exceptions.\nStandby redo logs #On the primary, I created standby redo logs — dedicated groups that will be used when (and if) this server becomes the standby after a switchover.\n-- Standby redo logs: n+1 relative to online redo logs -- If you have 3 online groups, create 4 standby groups ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 SIZE 200M; ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 SIZE 200M; ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 SIZE 200M; ALTER DATABASE ADD STANDBY LOGFILE GROUP 7 SIZE 200M; The rule is n+1: if the primary has three redo log groups, the standby needs four. It\u0026rsquo;s not documented very clearly, but I learned it the hard way — with three equal groups, under heavy load the standby can stall waiting for a free group.\nNetwork configuration #The tnsnames.ora on both nodes needs to know about both the primary and the standby. 
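(A quick aside before the network layer: the standby redo log groups added above can be verified from v$standby_log. A sketch; the sizes should match the online redo logs.)

```sql
-- Standby redo log groups should be n+1 versus the online groups
-- and sized identically to them.
SELECT group#, thread#, bytes/1024/1024 AS size_mb, status
FROM   v$standby_log;
```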
The configuration is symmetrical:\nORAPROD1 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.10.1)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = oraprod) ) ) ORAPROD2 = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.5.1)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = oraprod) ) ) The listener.ora on the standby must include a static entry for the database, because during the restore the standby isn\u0026rsquo;t open yet and the listener can\u0026rsquo;t register it dynamically:\nSID_LIST_LISTENER = (SID_LIST = (SID_DESC = (GLOBAL_DBNAME = oraprod_DGMGRL) (ORACLE_HOME = /u01/app/oracle/product/19c) (SID_NAME = oraprod) ) ) The _DGMGRL suffix is used by the Data Guard Broker to identify the instance. Without this static entry, the broker can\u0026rsquo;t connect to the standby and switchover operations fail with cryptic errors that cost you half a day.\nCreating the standby #For the initial database copy, I used an RMAN DUPLICATE over the network. No tape backup, no manual file transfers. Direct, from primary to standby:\n-- On the standby server, start the instance in NOMOUNT STARTUP NOMOUNT PFILE=\u0026#39;/u01/app/oracle/product/19c/dbs/initoraprod.ora\u0026#39;; -- From RMAN, connected to both RMAN TARGET sys/password@ORAPROD1 AUXILIARY sys/password@ORAPROD2 DUPLICATE TARGET DATABASE FOR STANDBY FROM ACTIVE DATABASE DORECOVER SPFILE SET db_unique_name=\u0026#39;oraprod_stby\u0026#39; SET log_archive_dest_2=\u0026#39;\u0026#39; SET fal_server=\u0026#39;ORAPROD1\u0026#39; NOFILENAMECHECK; NOFILENAMECHECK is used when file paths are identical on both machines — same directory structure, same naming convention. If paths differ, you need DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT parameters.\nThe copy took about three hours for 400 GB over a dedicated 1 Gbps line. 
Not the fastest, but it\u0026rsquo;s a one-time operation.\nData Guard Broker #The Broker is the component that manages the Data Guard configuration centrally and allows switchover with a single command. Without the Broker you can do everything manually, but you don\u0026rsquo;t want to do it manually when the primary just went down and the CEO is calling every five minutes.\n-- On the primary ALTER SYSTEM SET dg_broker_start=TRUE; -- On the standby ALTER SYSTEM SET dg_broker_start=TRUE; Then, from DGMGRL on the primary:\nDGMGRL\u0026gt; CREATE CONFIGURATION dg_config AS PRIMARY DATABASE IS oraprod CONNECT IDENTIFIER IS ORAPROD1; DGMGRL\u0026gt; ADD DATABASE oraprod_stby AS CONNECT IDENTIFIER IS ORAPROD2 MAINTAINED AS PHYSICAL; DGMGRL\u0026gt; ENABLE CONFIGURATION; At that point, SHOW CONFIGURATION should return:\nConfiguration - dg_config Protection Mode: MaxPerformance Members: oraprod - Primary database oraprod_stby - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS The word you want to see is SUCCESS. Anything else means there\u0026rsquo;s a network, configuration, or permissions issue to resolve before moving forward.\nThe first switchover #Two weeks after the architecture went live, I ran the first switchover test. On a Saturday morning, with the application shut down, but with the CEO present — he wanted to see it with his own eyes.\nDGMGRL\u0026gt; SWITCHOVER TO oraprod_stby; One command. Forty-two seconds. The primary became the standby, the standby became the primary. The applications, configured with the correct service, reconnected automatically.\nDGMGRL\u0026gt; SHOW CONFIGURATION; Configuration - dg_config Protection Mode: MaxPerformance Members: oraprod_stby - Primary database oraprod - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS Then we did the switchback — returning to the original primary. Another thirty-eight seconds. 
Clean.\nThe CEO looked at the screen, looked at me, and said: \u0026ldquo;Forty-two seconds versus six hours. Why didn\u0026rsquo;t we do this before?\u0026rdquo;\nI didn\u0026rsquo;t give him the answer. We both knew it.\nWhat they don\u0026rsquo;t tell you #The configuration I\u0026rsquo;ve described works. But there are things that Oracle\u0026rsquo;s documentation doesn\u0026rsquo;t emphasize enough.\nThe network gap. Synchronous replication (SYNC) guarantees zero data loss but introduces latency on every commit. With 12 km and a good fiber link, the added latency was 1-2 milliseconds — acceptable. But at 100 km it would have been 5-8 ms, and on an application with thousands of commits per second, the slowdown would be noticeable. That\u0026rsquo;s why I chose MaxPerformance mode (asynchronous) as the default, accepting the theoretical possibility of losing a few seconds of transactions in case of a total disaster. For that client, losing five seconds of data was infinitely better than losing ten hours.\nThe password file. The SYS user\u0026rsquo;s password file must be identical on both primary and standby. If you change it on one and not the other, redo transport stops silently. No obvious error, just a growing gap. I discovered this after an hour of debugging on a Sunday evening.\nTemp tablespaces. The standby doesn\u0026rsquo;t replicate temporary tablespaces. If you open the standby in read-only mode for reports (Active Data Guard), you need to manually create temp tablespaces, otherwise queries with sorts or hash joins fail with errors that have nothing to do with the real problem.\n-- On the standby opened in read-only mode ALTER TABLESPACE TEMP ADD TEMPFILE SIZE 2G AUTOEXTEND ON; Patches. Primary and standby must be at the same patch level. If you apply a PSU to the primary without applying it to the standby, the redo might contain structures that the standby can\u0026rsquo;t interpret. 
The switchover will work, but afterward you might have silent corruption. The correct procedure is: patch the standby first, switchover, patch the old primary (now standby), switchback.

The numbers #
Six months after implementation, the results were clear:

| Metric | Before | After |
|--------|--------|-------|
| RPO (Recovery Point Objective) | ~10 hours (nightly backup) | < 5 seconds |
| RTO (Recovery Time Objective) | 6+ hours (restore from backup) | < 1 minute (switchover) |
| Parallel report availability | No | Yes (Active Data Guard) |
| Additional infrastructure cost | — | 1 server + dedicated line |
| Switchover tests performed | 0 | 6 (one per month) |

The total project cost — server, licenses, dedicated line, implementation — was roughly a quarter of what that single day of downtime had cost. Not in technical terms. In terms of policies not issued, claims not processed, customers not served.

What I learned #
Disaster recovery isn't a technical problem. It's a risk perception problem. As long as the database is running, DR is an expense. When the database stops, DR is an investment that should have been made six months earlier.

You can't convince a CEO with an architectural diagram. You can only wait for the disaster to happen and then be ready with the solution. It's cynical, but that's how it works in ninety percent of cases.

The only thing you can do beforehand is document the risk, put it in writing that you flagged it, and keep the project ready in the drawer. I had proposed that project eighteen months earlier. It had been shelved with a "let's revisit it next year."

Next year arrived on a Wednesday morning in November, at 8:47 AM.

Glossary #
Data Guard — Oracle technology for real-time database replication to one or more standby servers.
The standby continuously receives and applies the primary\u0026rsquo;s redo logs, enabling switchover in seconds.\nRedo Log — Log files where Oracle records every data change before writing it to the datafiles. They are the foundation of recovery and Data Guard replication: without redo, none of these operations is possible.\nRPO — Recovery Point Objective. The maximum amount of data an organisation can afford to lose in a disaster, measured in time. With asynchronous Data Guard it is reduced to a few seconds.\nRTO — Recovery Time Objective. The maximum acceptable time to restore service after a failure. With Data Guard and automatic switchover, it goes from hours to under a minute.\nRMAN — Recovery Manager. Oracle\u0026rsquo;s native tool for backup, restore and recovery, including standby database creation via DUPLICATE ... FOR STANDBY FROM ACTIVE DATABASE.\n","date":"16 December 2025","permalink":"https://ivanluminaria.com/en/posts/oracle/oracle-data-guard/","section":"Database Strategy","summary":"\u003cp\u003eThe client was a mid-sized insurance company. Three hundred employees, an in-house management application running on Oracle 19c, a single physical server in the ground-floor server room. No replica. No standby. No disaster recovery plan.\u003c/p\u003e\n\u003cp\u003eFor five years everything had worked. 
And when things work, nobody wants to spend money protecting against problems they\u0026rsquo;ve never seen.\u003c/p\u003e\n\u003ch2 id=\"the-day-everything-stopped\"\u003eThe day everything stopped\u003c/h2\u003e\u003cp\u003eOn a Wednesday morning in November, at 8:47 AM, the primary data group\u0026rsquo;s disk suffered a hardware failure. Not a logical error, not a recoverable corruption. A physical failure. The RAID controller lost two disks simultaneously — one had been degraded for weeks without anyone noticing, the other gave out suddenly.\u003c/p\u003e","title":"From Single Instance to Data Guard: The Day the CEO Understood DR"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/switchover/","section":"Tags","summary":"","title":"Switchover"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/kimball/","section":"Tags","summary":"","title":"Kimball"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/scd/","section":"Tags","summary":"","title":"Scd"},{"content":"The sales director shows up at the Monday morning meeting with a simple question: \u0026ldquo;How many customers did we have in the North region last June?\u0026rdquo;\nThe DWH\u0026rsquo;s answer: silence.\nNot because the system was down, or the table was missing. The data was there, technically. But it was wrong. The DWH returned the customers currently in the North region — not the ones that were there in June.
Because every night, the loading process overwrote the customer master data with current values, erasing any trace of what came before.\nA customer who was in the North region in June and moved to the Central region in September? As far as the DWH was concerned, that customer had always been in the Central region. History didn\u0026rsquo;t exist.\nThe project and the original model #The context was a data warehouse in the insurance sector — claims management and customer portfolio. The source system held a master record for each customer: name, region, assigned agent, risk class, policy type.\nThe DWH dimension was modeled like this:\nCREATE TABLE dim_customer ( customer_id NUMBER(10) NOT NULL, name VARCHAR2(100) NOT NULL, region VARCHAR2(50) NOT NULL, agent VARCHAR2(100), risk_class VARCHAR2(20), policy_type VARCHAR2(50), CONSTRAINT pk_dim_customer PRIMARY KEY (customer_id) ); The nightly ETL was a simple MERGE : if the customer exists, update all fields; if not, insert.\nMERGE INTO dim_customer d USING stg_customer s ON (d.customer_id = s.customer_id) WHEN MATCHED THEN UPDATE SET d.name = s.name, d.region = s.region, d.agent = s.agent, d.risk_class = s.risk_class, d.policy_type = s.policy_type WHEN NOT MATCHED THEN INSERT ( customer_id, name, region, agent, risk_class, policy_type ) VALUES ( s.customer_id, s.name, s.region, s.agent, s.risk_class, s.policy_type ); Simple, clean, fast. And completely wrong for a data warehouse.\nThis is what Kimball calls SCD Type 1 — Slowly Changing Dimension Type 1. Overwrite the old value with the new one. No history, no versioning. The current value erases the previous one.\nFor an OLTP system it\u0026rsquo;s perfect: you always want the current address, the updated phone number, the valid email. But a data warehouse isn\u0026rsquo;t a transactional system. A data warehouse is a time machine. 
And a time machine that overwrites the past is useless.\nWhat you lose with Type 1 #The sales director wasn\u0026rsquo;t the only one asking questions the DWH couldn\u0026rsquo;t answer. Here\u0026rsquo;s a sample of requests that piled up over three months:\n\u0026ldquo;How many customers moved from High risk class to Low in the last year?\u0026rdquo; — Impossible. The previous risk class no longer exists. \u0026ldquo;Has agent Rossi lost customers compared to last quarter?\u0026rdquo; — Impossible. If a customer was reassigned to agent Bianchi, there\u0026rsquo;s no trace they ever belonged to Rossi. \u0026ldquo;Did the South region\u0026rsquo;s revenue drop or did customers just relocate?\u0026rdquo; — Impossible to tell. If a 200K customer moved from South to Central, the South region\u0026rsquo;s revenue drops — but not because business is bad. The customer simply changed address. Every time the answer was the same: \u0026ldquo;The system doesn\u0026rsquo;t keep history.\u0026rdquo; Which in business language means: \u0026ldquo;We don\u0026rsquo;t know.\u0026rdquo;\nAt some point the CFO requested a quarterly analysis report comparing customer portfolio composition between Q1 and Q2. The BI team tried to build it. It took three days. The result was unreliable because Q1 data no longer existed — it had been overwritten with Q2 data. The report was comparing Q2 with Q2 dressed up as Q1.\nThat was the moment that triggered the restructuring project.\nSCD Type 2: the principle #Type 2 doesn\u0026rsquo;t overwrite. It versions.\nWhen an attribute changes, the current record is closed — it gets an end validity date — and a new record is inserted with the updated values and a new start validity date. 
The old record stays in the database, intact, with all the values it had when it was current.\nTo make this work you need three additional elements in the dimension table:\nA surrogate key — an identifier generated by the DWH, distinct from the source system\u0026rsquo;s natural key. This is needed because the same customer will have multiple records (one per version), so the natural key is no longer unique. Validity dates — valid_from and valid_to — defining the time interval during which each version of the record was current. A current version flag — is_current — for fast retrieval of the active version without filtering on dates. The new dimension table #CREATE TABLE dim_customer ( customer_key NUMBER(10) NOT NULL, customer_id NUMBER(10) NOT NULL, name VARCHAR2(100) NOT NULL, region VARCHAR2(50) NOT NULL, agent VARCHAR2(100), risk_class VARCHAR2(20), policy_type VARCHAR2(50), valid_from DATE NOT NULL, valid_to DATE NOT NULL, is_current CHAR(1) DEFAULT \u0026#39;Y\u0026#39; NOT NULL, CONSTRAINT pk_dim_customer PRIMARY KEY (customer_key) ); CREATE INDEX idx_dim_customer_natural ON dim_customer (customer_id, is_current); CREATE INDEX idx_dim_customer_validity ON dim_customer (customer_id, valid_from, valid_to); CREATE SEQUENCE seq_dim_customer START WITH 1 INCREMENT BY 1; The customer_key is the surrogate key — generated by the sequence, never taken from the source system. The customer_id is the natural key — used to link the different versions of the same customer.\nI set valid_to for current records to DATE '9999-12-31' — a standard convention that simplifies temporal queries. When you look for the record valid at a certain date, the filter WHERE reference_date BETWEEN valid_from AND valid_to works without special cases.\nThe ETL logic #Type 2 ETL has two phases: first close the records that changed, then insert the new versions. 
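Before the ETL details, one example of what the versioned schema buys you: the director's question from the opening becomes a one-line temporal query. A sketch; the date and the region literal are illustrative:

```sql
-- How many customers were in the North region on 30 June?
-- The BETWEEN works with no special cases thanks to the
-- DATE '9999-12-31' convention for current records.
SELECT COUNT(DISTINCT customer_id)
FROM   dim_customer
WHERE  region = 'North'
AND    DATE '2025-06-30' BETWEEN valid_from AND valid_to;
```

Swap the date and you get the answer for any point in the dimension's history.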
The order matters — if you insert before closing, there\u0026rsquo;s a moment when two \u0026ldquo;current\u0026rdquo; versions of the same customer exist.\nPhase 1: identify and close modified records #MERGE INTO dim_customer d USING ( SELECT s.customer_id, s.name, s.region, s.agent, s.risk_class, s.policy_type FROM stg_customer s JOIN dim_customer d ON s.customer_id = d.customer_id AND d.is_current = \u0026#39;Y\u0026#39; WHERE (s.region != d.region OR s.agent != d.agent OR s.risk_class != d.risk_class OR s.policy_type != d.policy_type OR s.name != d.name) ) changed ON (d.customer_id = changed.customer_id AND d.is_current = \u0026#39;Y\u0026#39;) WHEN MATCHED THEN UPDATE SET d.valid_to = TRUNC(SYSDATE) - 1, d.is_current = \u0026#39;N\u0026#39;; The WHERE clause compares every tracked attribute. If even one is different, the current record is closed: valid_to is set to yesterday and is_current becomes \u0026lsquo;N\u0026rsquo;.\nA practical note: comparison with != doesn\u0026rsquo;t handle NULLs. If agent can be NULL, you need NULL-safe comparison functions. In Oracle I use DECODE:\nWHERE DECODE(s.region, d.region, 0, 1) = 1 OR DECODE(s.agent, d.agent, 0, 1) = 1 OR DECODE(s.risk_class, d.risk_class, 0, 1) = 1 -- ... 
DECODE treats two NULLs as equal — exactly the behavior you need.\nPhase 2: insert new versions #INSERT INTO dim_customer ( customer_key, customer_id, name, region, agent, risk_class, policy_type, valid_from, valid_to, is_current ) SELECT seq_dim_customer.NEXTVAL, s.customer_id, s.name, s.region, s.agent, s.risk_class, s.policy_type, TRUNC(SYSDATE), DATE \u0026#39;9999-12-31\u0026#39;, \u0026#39;Y\u0026#39; FROM stg_customer s WHERE NOT EXISTS ( SELECT 1 FROM dim_customer d WHERE d.customer_id = s.customer_id AND d.is_current = \u0026#39;Y\u0026#39; ); This INSERT catches two cases: entirely new customers (who don\u0026rsquo;t exist in dim_customer) and customers whose current version was just closed in Phase 1 (who therefore no longer have a record with is_current = 'Y').\nThe valid_from is today\u0026rsquo;s date. The valid_to is \u0026ldquo;end of time\u0026rdquo; — 9999-12-31. The customer_key is generated by the sequence.\nThe data: before and after #Let\u0026rsquo;s look at a concrete example. Customer 2001 — \u0026ldquo;Alfa Insurance Ltd\u0026rdquo; — is in the North region, assigned to agent Rossi, risk class Medium.\nIn July the customer is reassigned to agent Bianchi. In October the risk class changes from Medium to High.\nWith Type 1 (the previous model), in October dim_customer contains a single row:\nCUSTOMER_ID NAME REGION AGENT RISK_CLASS ----------- -------------------- ------ ------- ---------- 2001 Alfa Insurance Ltd North Bianchi High No trace of Rossi. No trace of Medium risk class. 
As far as the DWH knows, this customer has always belonged to agent Bianchi with High risk class.\nWith Type 2, in October dim_customer contains three rows:\nKEY CUSTOMER_ID NAME REGION AGENT CLASS VALID_FROM VALID_TO CURRENT ---- ----------- -------------------- ------ ------- ------ ---------- ---------- ------- 1001 2001 Alfa Insurance Ltd North Rossi Medium 2025-01-15 2025-07-09 N 1002 2001 Alfa Insurance Ltd North Bianchi Medium 2025-07-10 2025-10-04 N 1003 2001 Alfa Insurance Ltd North Bianchi High 2025-10-05 9999-12-31 Y Three versions of the same customer. Each version tells a piece of the story: who the agent was, what the risk class was, and during which period. The dates don\u0026rsquo;t overlap. The is_current flag identifies the active version.\nTemporal queries #Now the sales director can get his answer.\nHow many customers in the North region in June? #SELECT COUNT(DISTINCT customer_id) AS north_customers_june FROM dim_customer WHERE region = \u0026#39;North\u0026#39; AND DATE \u0026#39;2025-06-15\u0026#39; BETWEEN valid_from AND valid_to; The query is straightforward: get all records that were valid on June 15, 2025 and filter by region. No CASE WHEN, no conditional logic, no approximations.\nCustomers who changed risk class in the last year #SELECT c1.customer_id, c1.name, c1.risk_class AS previous_class, c2.risk_class AS current_class, c1.valid_to + 1 AS change_date FROM dim_customer c1 JOIN dim_customer c2 ON c1.customer_id = c2.customer_id AND c1.valid_to + 1 = c2.valid_from WHERE c1.risk_class != c2.risk_class AND c1.valid_to \u0026gt;= ADD_MONTHS(TRUNC(SYSDATE), -12) ORDER BY change_date DESC; Two consecutive versions of the same customer, joined by transition date. If the risk class differs between the two versions, the customer changed class. 
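The version chain above is easy to exercise in miniature. Here is a sketch with Python's sqlite3 module (the table is trimmed to the columns the example uses, and the dates are the ones from the story, not a production schema); the point-in-time filter and the consecutive-version join mirror the two queries just shown:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
  customer_key INTEGER PRIMARY KEY,
  customer_id  INTEGER NOT NULL,
  agent        TEXT,
  risk_class   TEXT,
  valid_from   TEXT NOT NULL,
  valid_to     TEXT NOT NULL,
  is_current   TEXT NOT NULL
);
-- The three versions of customer 2001 from the example
INSERT INTO dim_customer VALUES
  (1001, 2001, 'Rossi',   'Medium', '2025-01-15', '2025-07-09', 'N'),
  (1002, 2001, 'Bianchi', 'Medium', '2025-07-10', '2025-10-04', 'N'),
  (1003, 2001, 'Bianchi', 'High',   '2025-10-05', '9999-12-31', 'Y');
""")

# Point-in-time lookup: which version was current on 15 June 2025?
june = conn.execute("""
  SELECT agent, risk_class FROM dim_customer
  WHERE customer_id = 2001
    AND '2025-06-15' BETWEEN valid_from AND valid_to
""").fetchone()
print(june)  # ('Rossi', 'Medium')

# Consecutive versions joined on the transition date expose the class change
changes = conn.execute("""
  SELECT c1.risk_class, c2.risk_class, c2.valid_from
  FROM dim_customer c1
  JOIN dim_customer c2
    ON  c1.customer_id = c2.customer_id
    AND date(c1.valid_to, '+1 day') = c2.valid_from
  WHERE c1.risk_class <> c2.risk_class
""").fetchall()
print(changes)  # [('Medium', 'High', '2025-10-05')]
```

The Rossi-to-Bianchi transition is correctly filtered out of the second query because the risk class did not change at that boundary.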
The change date is the day after the previous version was closed.\nQ1 vs Q2 portfolio comparison #SELECT region, COUNT(DISTINCT CASE WHEN DATE \u0026#39;2025-03-31\u0026#39; BETWEEN valid_from AND valid_to THEN customer_id END) AS customers_q1, COUNT(DISTINCT CASE WHEN DATE \u0026#39;2025-06-30\u0026#39; BETWEEN valid_from AND valid_to THEN customer_id END) AS customers_q2 FROM dim_customer WHERE DATE \u0026#39;2025-03-31\u0026#39; BETWEEN valid_from AND valid_to OR DATE \u0026#39;2025-06-30\u0026#39; BETWEEN valid_from AND valid_to GROUP BY region ORDER BY region; A single table scan, two distinct counts filtered by date. The CFO gets his quarterly report — the real one, not the one comparing Q2 with itself.\nThe fact table and surrogate keys #A point that\u0026rsquo;s often underestimated: the fact table must use the surrogate key, not the natural key.\nCREATE TABLE fact_claim ( claim_key NUMBER(10) NOT NULL, customer_key NUMBER(10) NOT NULL, -- FK to the specific version date_key NUMBER(8) NOT NULL, amount NUMBER(15,2), claim_type VARCHAR2(50), CONSTRAINT pk_fact_claim PRIMARY KEY (claim_key), CONSTRAINT fk_fact_customer FOREIGN KEY (customer_key) REFERENCES dim_customer (customer_key) ); The customer_key in the fact points to the version of the customer that was current at the time of the claim. If a claim occurs in May, when the customer still belonged to agent Rossi, the fact points to the version with agent Rossi. If another claim occurs in September, with the customer now under agent Bianchi, the fact points to the version with agent Bianchi.\nThe result is that every fact is associated with the correct dimensional context for the moment it occurred. Query May\u0026rsquo;s claims and you see agent Rossi. Query September\u0026rsquo;s claims and you see agent Bianchi. 
No temporal logic in the query — the direct JOIN between fact and dimension returns the right context.\n-- Claims by agent, with the correct context at the time of the claim SELECT d.agent, COUNT(*) AS num_claims, SUM(f.amount) AS total_amount FROM fact_claim f JOIN dim_customer d ON f.customer_key = d.customer_key GROUP BY d.agent ORDER BY total_amount DESC; No temporal clause. The surrogate key JOIN does all the work.\nThe size of Type 2 #The cost of Type 2 is dimension table growth. With Type 1, each customer is one row. With Type 2, each customer can have N rows — one for each tracked attribute change.\nIn the insurance project the numbers looked like this:\nMetric Value Active customers ~120,000 Tracked attributes 4 (region, agent, risk class, policy type) Average change rate ~8% of customers/year dim_customer rows after 1 year ~140,000 dim_customer rows after 3 years ~180,000 dim_customer rows after 5 years ~220,000 From 120K to 220K in five years. An 83% increase — which sounds like a lot in percentage terms but is negligible in absolute terms. 220K rows are nothing for Oracle. Queries that hit the surrogate-key index stay in the millisecond range.\nThe issue arises when you have millions of customers with high change rates. In that case you monitor growth, consider partitioning the dimension, and most importantly choose carefully which attributes to track. Not every attribute deserves Type 2. The customer\u0026rsquo;s phone number? Type 1, overwrite. The sales region? Type 2, because it affects revenue analysis.\nThe choice of which attributes to track with Type 2 is a business decision, not a technical one. Ask the business: \u0026ldquo;If this field changes, do you need to know what the previous value was?\u0026rdquo; If the answer is yes, it\u0026rsquo;s Type 2. If it\u0026rsquo;s no, it\u0026rsquo;s Type 1.\nWhen you don\u0026rsquo;t need Type 2 #Not every dimension needs history. 
I\u0026rsquo;ve seen projects where every dimension was Type 2 \u0026ldquo;just in case\u0026rdquo; — the result was a needlessly complex model, slow ETL, and nobody ever querying the history of the \u0026ldquo;payment_type\u0026rdquo; or \u0026ldquo;sales_channel\u0026rdquo; dimension.\nType 2 has a cost: ETL complexity, table growth, the need to manage surrogate keys in the fact table. It\u0026rsquo;s a cost worth paying when the business needs history. If it doesn\u0026rsquo;t, Type 1 is the right choice.\nThere are also cases where Type 2 isn\u0026rsquo;t enough. If you need to know not just what changed but also who made the change and why, then you need an audit trail — a separate table with a complete change log. Type 2 tracks versions, not causes.\nAnd for dimensions with very frequent changes — prices that change daily, scores that update hourly — Type 2 can generate unsustainable growth. In those cases you consider Type 6 (a combination of Types 1, 2 and 3) or mini-dimension approaches.\nBut for the most common case — customer master data, products, employees, locations — Type 2 is the right tool. Simple enough to implement without exotic frameworks, powerful enough to give the business back the dimension it was missing: time.\nWhat I learned #The sales director didn\u0026rsquo;t know he needed history until he needed it. And when he needed it, the DWH didn\u0026rsquo;t have it.\nThat\u0026rsquo;s the point. You don\u0026rsquo;t implement Type 2 because \u0026ldquo;it\u0026rsquo;s best practice\u0026rdquo; or because \u0026ldquo;Kimball says so in chapter 5.\u0026rdquo; You implement it because a data warehouse without history is an operational database with a star schema bolted on top. It works for current month reports, but it can\u0026rsquo;t answer the question that sooner or later someone will ask: \u0026ldquo;What was it like before?\u0026rdquo;\nThe question always comes. 
The only question is whether your DWH is ready to answer.\nGlossary #Surrogate key — A numeric identifier generated by the data warehouse, distinct from the source system\u0026rsquo;s natural key. In SCD Type 2 it\u0026rsquo;s essential because the same record can have multiple versions, making the natural key no longer unique.\nFact table — The central table in a star schema containing numeric measures (amounts, quantities, counts) and foreign keys to dimension tables. Each row represents a business event or transaction.\nKimball — Ralph Kimball, author of the data warehouse design methodology based on dimensional modeling, star schemas and bottom-up ETL processes. His framework classifies Slowly Changing Dimensions into types 0 through 7.\nMERGE — A SQL statement that combines INSERT and UPDATE in a single operation: if the record exists it updates it, if it doesn\u0026rsquo;t it inserts it. In Oracle it\u0026rsquo;s also known as \u0026ldquo;upsert\u0026rdquo; and is the core ETL mechanism for SCD dimensions.\nStar schema — A data model typical of data warehouses: a central fact table connected to multiple dimension tables via foreign keys. It simplifies analytical queries and optimizes aggregation performance.\n","date":"11 November 2025","permalink":"https://ivanluminaria.com/en/posts/data-warehouse/scd-tipo-2/","section":"Database Strategy","summary":"\u003cp\u003eThe sales director shows up at the Monday morning meeting with a simple question: \u0026ldquo;How many customers did we have in the North region last June?\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eThe DWH\u0026rsquo;s answer: silence.\u003c/p\u003e\n\u003cp\u003eNot because the system was down, or the table was missing. The data was there, technically. But it was wrong. The DWH returned the customers currently in the North region — not the ones that were there in June. 
Because every night, the loading process overwrote the customer master data with current values, erasing any trace of what came before.\u003c/p\u003e","title":"SCD Type 2: the history the business didn't know it needed"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/csv-export/","section":"Tags","summary":"","title":"Csv-Export"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/multi-instance/","section":"Tags","summary":"","title":"Multi-Instance"},{"content":"The ticket said: \u0026ldquo;We need a CSV export from the orders table in the ERP database. By 2 PM.\u0026rdquo;\nIt was 11 AM. Three hours for a SELECT with INTO OUTFILE — a five-minute job, I thought. Then I opened the VPN, connected to the server and realized five minutes were not going to cut it.\nThe server was a CentOS 7 box running four MySQL instances. Four. On the same host, with four different systemd services, four different ports, four different Unix sockets, four different data directories. A setup someone had put together years earlier — probably to save on a second server — and that no one had touched or documented since.\nThe first problem was not the query. The first problem was: which of the four instances do I need to connect to?\nThe Environment: Four MySQL Instances, One Server #Multi-instance MySQL environments are not as rare as you might think. I run into them more often than I would like, especially in small and mid-sized companies where servers are scarce and applications are plenty. The logic is simple: instead of buying four servers, you buy one beefy machine and run four MySQL instances on it, each with its own database, its own port, its own configuration file.\nIt works, until you need to do maintenance. 
And maintenance on a multi-instance setup with no documentation is an exercise in IT archaeology.\nOn that server, the situation looked like this:\nsystemctl list-units --type=service | grep mysql mysqld.service loaded active running MySQL Server (porta 3306) mysqld-app2.service loaded active running MySQL Server (porta 3307) mysqld-reporting.service loaded active running MySQL Server (porta 3308) mysqld-legacy.service loaded active running MySQL Server (porta 3309) Four services. The names were vaguely descriptive — \u0026ldquo;app2\u0026rdquo;, \u0026ldquo;reporting\u0026rdquo;, \u0026ldquo;legacy\u0026rdquo; — but the ticket mentioned the \u0026ldquo;ERP\u0026rdquo; without specifying which instance hosted that database. No internal wiki, no README file on the server, no comments in the configuration files.\nFinding the Right Instance #The first step was figuring out which instance held the orders database. The technique is always the same: start from the systemd service, trace back to the configuration file, read the port and socket from there.\nsystemctl cat mysqld-app2.service | grep ExecStart ExecStart=/usr/sbin/mysqld --defaults-file=/etc/mysql/app2.cnf Each service pointed to a different my.cnf. I checked all four:\ngrep -E \u0026#34;^(port|socket|datadir)\u0026#34; /etc/mysql/app2.cnf port = 3307 socket = /var/run/mysqld/mysqld-app2.sock datadir = /data/mysql-app2 For each instance I noted down the port, socket and datadir. 
Then I did a quick round:\nmysql --socket=/var/run/mysqld/mysqld.sock -u root -p -e \u0026#34;SHOW DATABASES;\u0026#34; 2\u0026gt;/dev/null mysql --socket=/var/run/mysqld/mysqld-app2.sock -u root -p -e \u0026#34;SHOW DATABASES;\u0026#34; 2\u0026gt;/dev/null mysql --socket=/var/run/mysqld/mysqld-reporting.sock -u root -p -e \u0026#34;SHOW DATABASES;\u0026#34; 2\u0026gt;/dev/null mysql --socket=/var/run/mysqld/mysqld-legacy.sock -u root -p -e \u0026#34;SHOW DATABASES;\u0026#34; 2\u0026gt;/dev/null The gestionale_prod database was on the second instance — the one on port 3307 with socket /var/run/mysqld/mysqld-app2.sock.\nOne detail that seems trivial but makes all the difference in a multi-instance environment: when you connect to MySQL specifying only -h localhost, the client does not use TCP. It uses the default Unix socket , which almost always belongs to the primary instance on port 3306. If the database you are looking for lives on a different instance, you connect to the wrong one without even realizing it.\nConnecting and Verifying #Once I had identified the instance, I connected specifying the socket explicitly:\nmysql --socket=/var/run/mysqld/mysqld-app2.sock -u root -p First thing after login: verify you are on the right instance.\nSHOW VARIABLES LIKE \u0026#39;port\u0026#39;; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | port | 3307 | +---------------+-------+ SELECT DATABASE(); USE gestionale_prod; SHOW TABLES LIKE \u0026#39;%ordini%\u0026#39;; +----------------------------------+ | Tables_in_gestionale_prod | +----------------------------------+ | ordini | | ordini_dettaglio | | ordini_storico | +----------------------------------+ Port 3307, database present, orders table right where it should be. The connection was correct.\nThe port check may look like paranoia, but it is not. In an environment with four instances, mixing up which socket points to which port is easier than you think. 
And you only discover the mistake when the data you export is not what you expected — or worse, when you make a change thinking you are on the test database and find out you were on production.\nThe First Attempt: INTO OUTFILE #The query was straightforward. The requester wanted the last quarter\u0026rsquo;s orders with amount, customer and date:\nSELECT o.id_ordine, o.data_ordine, c.ragione_sociale, o.importo_totale FROM ordini o JOIN clienti c ON o.id_cliente = c.id_cliente WHERE o.data_ordine \u0026gt;= \u0026#39;2025-07-01\u0026#39; ORDER BY o.data_ordine; My first instinct was to use `INTO OUTFILE` , MySQL\u0026rsquo;s native way of writing results to a file:\nSELECT o.id_ordine, o.data_ordine, c.ragione_sociale, o.importo_totale FROM ordini o JOIN clienti c ON o.id_cliente = c.id_cliente WHERE o.data_ordine \u0026gt;= \u0026#39;2025-07-01\u0026#39; ORDER BY o.data_ordine INTO OUTFILE \u0026#39;/tmp/export_ordini.csv\u0026#39; FIELDS TERMINATED BY \u0026#39;,\u0026#39; ENCLOSED BY \u0026#39;\u0026#34;\u0026#39; LINES TERMINATED BY \u0026#39;\\n\u0026#39;; MySQL\u0026rsquo;s response was blunt:\nERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement There it was. The wall.\nsecure-file-priv : The Directive That Blocks Everything (And Rightly So) #The secure_file_priv variable is how MySQL restricts file read and write operations. It controls where LOAD DATA INFILE, SELECT INTO OUTFILE and the LOAD_FILE() function are allowed to operate.\nSHOW VARIABLES LIKE \u0026#39;secure_file_priv\u0026#39;; +------------------+------------------------+ | Variable_name | Value | +------------------+------------------------+ | secure_file_priv | /var/lib/mysql-files/ | +------------------+------------------------+ This variable has three modes:\nA specific path (e.g. 
/var/lib/mysql-files/): file operations work, but only within that directory Empty string (\u0026quot;\u0026quot;): no restrictions — MySQL can read and write anywhere the system user has permissions NULL: file operations are completely disabled My instance was configured with a specific path. The attempt to write to /tmp/ was blocked because /tmp/ is not /var/lib/mysql-files/.\nThe first reaction — one I see many people have — would be: \u0026ldquo;let\u0026rsquo;s change secure-file-priv to an empty string in my.cnf and restart.\u0026rdquo; No. Absolutely not. On a production server with four MySQL instances, restarting an instance at 11:30 in the morning for a CSV export is not an option. And disabling a security protection is never the right answer, not even in an emergency.\nThe obvious alternative was to write the file to the authorized directory:\nSELECT o.id_ordine, o.data_ordine, c.ragione_sociale, o.importo_totale FROM ordini o JOIN clienti c ON o.id_cliente = c.id_cliente WHERE o.data_ordine \u0026gt;= \u0026#39;2025-07-01\u0026#39; ORDER BY o.data_ordine INTO OUTFILE \u0026#39;/var/lib/mysql-files/export_ordini.csv\u0026#39; FIELDS TERMINATED BY \u0026#39;,\u0026#39; ENCLOSED BY \u0026#39;\u0026#34;\u0026#39; LINES TERMINATED BY \u0026#39;\\n\u0026#39;; But there was another problem. The /var/lib/mysql-files/ directory belonged to the primary instance (port 3306). The instance on port 3307 had its own separate datadir under /data/mysql-app2/, and its secure_file_priv pointed to /data/mysql-app2/files/ — a directory that did not exist and that nobody had ever created.\nI could have created the directory, assigned the correct permissions to the mysql user and written there. But at that point I was already losing time. 
And there is a cleaner way.\nThe Solution: Shell Export with the mysql Client #When INTO OUTFILE is blocked or inconvenient, the most practical solution is to bypass MySQL\u0026rsquo;s file-writing mechanism entirely and use the command-line client to redirect the output.\nThe trick is in the -B (batch mode) and -e (execute) options:\nmysql --socket=/var/run/mysqld/mysqld-app2.sock \\ -u root -p \\ -B -e \u0026#34; SELECT o.id_ordine, o.data_ordine, c.ragione_sociale, o.importo_totale FROM ordini o JOIN clienti c ON o.id_cliente = c.id_cliente WHERE o.data_ordine \u0026gt;= \u0026#39;2025-07-01\u0026#39; ORDER BY o.data_ordine \u0026#34; gestionale_prod \u0026gt; /tmp/export_ordini.tsv The -B option produces tab-separated output without the ASCII table borders. The result is a clean TSV file that opens without issues in any spreadsheet application.\nIf you need an actual CSV with commas as separators, just pipe through sed:\nmysql --socket=/var/run/mysqld/mysqld-app2.sock \\ -u root -p \\ -B -N -e \u0026#34; SELECT o.id_ordine, o.data_ordine, c.ragione_sociale, o.importo_totale FROM ordini o JOIN clienti c ON o.id_cliente = c.id_cliente WHERE o.data_ordine \u0026gt;= \u0026#39;2025-07-01\u0026#39; ORDER BY o.data_ordine \u0026#34; gestionale_prod | sed \u0026#39;s/\\t/,/g\u0026#39; \u0026gt; /tmp/export_ordini.csv The -N option removes the header row with column names. If you want it, drop the flag.\nThe file was ready in under a minute. 12,400 rows, 1.2 MB. I copied it to my machine with scp, checked it opened correctly in LibreOffice Calc and sent it to the requester. It was 11:45. 
The ticket that was supposed to take five minutes had taken forty-five — but at least I had not restarted any instances.\nWhy You Should Not Disable secure-file-priv #The temptation to set secure_file_priv = \u0026quot;\u0026quot; is strong, especially on development servers or on machines where \u0026ldquo;it\u0026rsquo;s just us anyway.\u0026rdquo; The problem is that protection exists for a very specific reason.\nWithout secure_file_priv, a MySQL user with the FILE privilege can:\nRead any file readable by the mysql system user — including /etc/passwd, configuration files, SSH keys if permissions are not locked down Write files anywhere the mysql user has write access — including the webroot of an Apache or Nginx running on the same server In a SQL injection scenario, the FILE privilege combined with an empty secure_file_priv is an open door. The attacker can read system files, write web shells, escalate privileges. This is not theory — it is one of the most well-documented attack vectors in penetration tests against web applications backed by MySQL.\nThe rule is simple: configure secure_file_priv with a specific path, create the necessary directories for each instance at setup time, and leave them there. If you need to do occasional exports, the mysql command-line client does the same job without touching the security configuration.\nLessons from a Five-Minute Ticket #That ticket reminded me of three things that in thirty years of working with databases I have seen confirmed hundreds of times.\nThe first: in a multi-instance environment, the first step is always identifying the instance. It sounds obvious, but the number of mistakes that come from connecting to the wrong instance — thinking you are somewhere else — is staggering. A SHOW VARIABLES LIKE 'port' after every connection is not paranoia, it is operational hygiene.\nThe second: secure-file-priv is not an obstacle, it is a safeguard. When it blocks you, that is not the moment to disable it. 
That is the moment to use an alternative path or an alternative method. The directive exists because MySQL in the hands of a user with the FILE privilege and no filesystem restrictions is a real risk.\nThe third: the mysql command-line client is more powerful than most DBAs give it credit for. With -B, -N, -e and a pipe to sed or awk, you can do exports, transformations and automations without ever touching INTO OUTFILE. Less elegant, maybe. But it always works, requires no special permissions and does not need someone to have created the right directory six months earlier.\nThe CSV arrived at 11:45. The requester never knew that behind five columns and 12,400 rows there were forty-five minutes of system archaeology. But that is how tickets work: the person who opens them sees the result, the person who resolves them sees the journey.\nGlossary #secure-file-priv — MySQL security directive that limits the directories where the server can read and write files via INTO OUTFILE, LOAD DATA INFILE and LOAD_FILE().\nUnix Socket — Local inter-process communication mechanism on Linux systems, used by MySQL as the default connection method when connecting to localhost.\nINTO OUTFILE — MySQL SQL clause for exporting query results directly to a file on the server\u0026rsquo;s filesystem. Subject to secure-file-priv restrictions.\nsystemd — Modern Linux service manager, used to manage multiple MySQL instances on the same server through separate unit files.\nSQL Injection — Attack technique that inserts malicious SQL code into application inputs. The secure-file-priv directive helps mitigate its impact.\n","date":"4 November 2025","permalink":"https://ivanluminaria.com/en/posts/mysql/mysql-multi-istanza-secure-file-priv/","section":"Database Strategy","summary":"\u003cp\u003eThe ticket said: \u0026ldquo;We need a CSV export from the orders table in the ERP database. By 2 PM.\u0026rdquo;\u003c/p\u003e\n\u003cp\u003eIt was 11 AM. 
Three hours for a SELECT with INTO OUTFILE — a five-minute job, I thought. Then I opened the VPN, connected to the server and realized five minutes were not going to cut it.\u003c/p\u003e\n\u003cp\u003eThe server was a CentOS 7 box running four MySQL instances. Four. On the same host, with four different \u003cspan class=\"glossary-tip\" tabindex=\"0\" data-glossary-desc=\"Linux init system and service manager, used to manage multiple MySQL/MariaDB instances on the same server through separate unit files.\" data-glossary-url=\"/en/glossary/systemd/\" data-glossary-more=\"Read more →\"\u003esystemd\u003c/span\u003e\n services, four different ports, four different Unix sockets, four different data directories. A setup someone had put together years earlier — probably to save on a second server — and that no one had touched or documented since.\u003c/p\u003e","title":"MySQL Multi-Instance: A Ticket, a CSV and the secure-file-priv Wall"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/secure-file-priv/","section":"Tags","summary":"","title":"Secure-File-Priv"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/socket/","section":"Tags","summary":"","title":"Socket"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/systemd/","section":"Tags","summary":"","title":"Systemd"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/troubleshooting/","section":"Tags","summary":"","title":"Troubleshooting"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/explain/","section":"Tags","summary":"","title":"Explain"},{"content":"The other day a colleague sends me a screenshot on Teams. A query running on a 2-million-row table, 45 seconds execution time. He writes:\n\u0026ldquo;I ran EXPLAIN ANALYZE, but I can\u0026rsquo;t figure out what\u0026rsquo;s wrong. The plan looks fine.\u0026rdquo;\nSpoiler: the plan was anything but fine. 
The optimizer had chosen a nested loop join where a hash join was needed, and the reason was trivial — stale statistics. But to get there I had to read the plan line by line, and that\u0026rsquo;s when I realized that most DBAs I know use EXPLAIN ANALYZE as a binary oracle: if the time is high, the query is slow. End of analysis.\nNo. EXPLAIN ANALYZE is a diagnostic tool, not a verdict. You need to know how to read it.\n🔧 EXPLAIN, EXPLAIN ANALYZE, EXPLAIN (ANALYZE, BUFFERS): three different things #Let\u0026rsquo;s start with the basics, because the confusion is more widespread than you\u0026rsquo;d think.\nEXPLAIN alone shows the estimated plan. The optimizer decides what it would do, but doesn\u0026rsquo;t execute anything. Useful for understanding the strategy, useless for understanding reality.\nEXPLAIN SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id WHERE o.created_at \u0026gt; \u0026#39;2025-01-01\u0026#39;; EXPLAIN ANALYZE actually runs the query and adds real timings. Now you can see how long each node took, how many rows it actually returned. But there\u0026rsquo;s a missing piece.\nEXPLAIN ANALYZE SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id WHERE o.created_at \u0026gt; \u0026#39;2025-01-01\u0026#39;; EXPLAIN (ANALYZE, BUFFERS) is what I always use. It adds information about how many disk pages were read, how many were in cache (shared hit) and how many had to be loaded from disk (shared read). Without BUFFERS you\u0026rsquo;re driving at night with no headlights.\nEXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id WHERE o.created_at \u0026gt; \u0026#39;2025-01-01\u0026#39;; Personal rule: if someone sends me an EXPLAIN without BUFFERS, I send it back.\n📖 Anatomy of a node: what to read and in what order #An execution plan is a tree. 
Each node looks like this:\n-\u0026gt; Hash Join (cost=1234.56..5678.90 rows=50000 width=120) (actual time=12.345..89.012 rows=48750 loops=1) Buffers: shared hit=1200 read=3400 Here\u0026rsquo;s what to look at:\ncost — two numbers separated by ... The first is the startup cost (how much before returning the first row), the second is the total estimated cost. These are arbitrary optimizer units, not milliseconds. They\u0026rsquo;re useful for comparing alternative plans, not for measuring absolute performance.\nrows — the rows estimated by the optimizer. Compare them with actual rows. If there\u0026rsquo;s an order of magnitude difference, you\u0026rsquo;ve found the problem.\nactual time — real time in milliseconds. Again two values: startup and total. Watch the loops field: if loops=10, the total time should be multiplied by 10.\nBuffers — shared hit are pages found in memory, shared read are pages read from disk. If read dominates, your working set doesn\u0026rsquo;t fit in RAM.\n🚨 The number one red flag: estimated rows vs actual rows #Back to my colleague\u0026rsquo;s case. The plan showed:\n-\u0026gt; Nested Loop (cost=0.87..45678.12 rows=150 width=200) (actual time=0.034..44890.123 rows=1950000 loops=1) The optimizer estimated 150 rows. In reality, almost 2 million arrived.\nWhen the estimate is off by 4 orders of magnitude, the plan is inevitably wrong. The optimizer chose a nested loop because it thought it was iterating over 150 rows. A nested loop on 150 rows is lightning fast. On 2 million, it\u0026rsquo;s a disaster.\nA hash join or merge join would have been the right choice. But the optimizer couldn\u0026rsquo;t know that with the statistics it had.\nRule of thumb: if the ratio between estimated and actual rows exceeds 10x, you have a statistics problem. Above 100x, the plan is almost certainly wrong.\n🔍 Why statistics lie #PostgreSQL maintains table statistics in pg_statistic (readable through pg_stats). 
These statistics include:\nvalue distribution (most common values) value histogram number of distinct values NULL percentage The optimizer uses this information to estimate the selectivity of every WHERE condition and the cardinality of every join.\nThe problem? Statistics are updated by ANALYZE — which can be manual or handled by autovacuum. But autovacuum triggers ANALYZE only when the number of modified rows exceeds a threshold:\nthreshold = autovacuum_analyze_threshold + autovacuum_analyze_scale_factor × n_live_tuples Defaults: 50 rows + 10% of live tuples. On a 2-million-row table, that means 200,000 modifications before an automatic ANALYZE kicks in.\nIn my colleague\u0026rsquo;s case, the orders table had grown from 500,000 to 2 million rows in three weeks — a massive import from a legacy system. Autovacuum hadn\u0026rsquo;t refreshed the statistics yet: the analyze threshold for the table as autovacuum knew it (50 rows plus 10% of 500,000, roughly 50,000) leaves a wide window, and nobody had thought to run a manual ANALYZE after the batch import.\nResult: the optimizer was still reasoning as if the table had 500,000 rows with the old value distribution.\n🛠️ Updating statistics: the first thing to do #The immediate solution was obvious:\nANALYZE orders; After the ANALYZE, I re-ran the query with EXPLAIN (ANALYZE, BUFFERS):\n-\u0026gt; Hash Join (cost=8500.00..32000.00 rows=1940000 width=200) (actual time=120.000..2800.000 rows=1950000 loops=1) Buffers: shared hit=28000 read=4500 From 45 seconds to under 3 seconds. The optimizer had chosen a hash join, the row estimate was accurate, and the plan was completely different.\nBut I didn\u0026rsquo;t stop there. If the problem happened once, it will happen again.\n📊 default_statistics_target: when 100 is not enough #PostgreSQL collects 100 sample values per column by default. For small tables or uniform distributions, that\u0026rsquo;s fine. 
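Why sample size matters on a skewed column is easy to see with a toy simulation. This is purely illustrative: Python's random module stands in for the row sampling that ANALYZE performs, and every number below is invented for the demo.

```python
import random

# Toy model of statistics sampling on a skewed column: 5% of the
# customer ids account for 60% of the rows (invented proportions).
random.seed(7)

heavy = list(range(50))                # the "hot" 5% of 1,000 customers
normal = list(range(50, 1_000))

rows = ([random.choice(heavy) for _ in range(60_000)] +
        [random.choice(normal) for _ in range(40_000)])

def distinct_in_sample(n: int) -> int:
    """Distinct customer_id values observed in a random sample of n rows."""
    return len(set(random.sample(rows, n)))

seen_100 = distinct_in_sample(100)
seen_500 = distinct_in_sample(500)

# A bigger sample observes far more of the true distribution; that
# extra visibility is what raising the statistics target buys the planner.
print(seen_100, seen_500)
```

With these made-up proportions the 100-row sample sees only a fraction of the distinct values that the 500-row sample does, which is roughly what a too-low statistics target does to the planner's most-common-values list and histogram.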
For large tables with skewed distributions, 100 samples can give a distorted picture.\nIn the orders table case, the customer_id column had a very skewed distribution: 5% of customers generated 60% of orders. With 100 samples, the optimizer couldn\u0026rsquo;t capture this asymmetry.\nThe solution:\nALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 500; ANALYZE orders; After raising the target to 500, the optimizer\u0026rsquo;s cardinality estimates for joins with customers became much more accurate.\nRule: if a column is frequently used in WHERE or JOIN clauses and has non-uniform distribution, raise the target. 500 is a good starting point. You can go up to 1000, but beyond that it rarely helps and slows down ANALYZE itself.\n⚠️ When to force the planner: enable_nestloop and enable_hashjoin #Sometimes, even with fresh statistics, the optimizer takes the wrong path. It happens with complex queries, many joined tables, or when column correlations mislead the estimates.\nPostgreSQL offers parameters to disable specific strategies:\nSET enable_nestloop = off; This makes nested loops prohibitively expensive in the planner\u0026rsquo;s cost model, so it avoids them wherever an alternative plan exists. It\u0026rsquo;s not a solution, it\u0026rsquo;s a diagnostic band-aid. If you disable nested loops and the query drops from 45 seconds to 3 seconds, you\u0026rsquo;ve confirmed the join strategy was the problem. But you can\u0026rsquo;t leave enable_nestloop = off in production because there are a thousand queries where nested loops are the right choice.\nI use these parameters in only two scenarios:\nDiagnostics: to confirm which join strategy is the problem Emergency: when the business is down and you need to get a critical query running while you look for the real fix After diagnostics, the correct fix is always on statistics, indexes, or query rewriting.\n📋 My workflow when a query is slow #After thirty years doing this job, my process has become almost mechanical:\n1. EXPLAIN (ANALYZE, BUFFERS) — always with BUFFERS. 
I save the complete output, not just the last few lines.\n2. Look for row discrepancies — I compare estimated rows= with actual rows= on every node. I start from the leaf nodes and work up to the root. The first significant discrepancy is almost always the cause.\n3. Check the statistics — I look at pg_stats for the involved columns. I verify last_autoanalyze and last_analyze in pg_stat_user_tables. If the last ANALYZE is old, I run it and re-evaluate.\n4. Evaluate BUFFERS — if shared read is very high compared to shared hit, the problem might be I/O, not the plan. In that case the fix is tuning shared_buffers, or accepting that the working set simply doesn\u0026rsquo;t fit in RAM.\n5. Test alternatives — if statistics are fresh but the plan is still wrong, I use enable_nestloop, enable_hashjoin, enable_mergejoin to understand which strategy works best. Then I try to guide the optimizer toward that strategy with indexes or query rewriting.\nNothing spectacular. No magic tricks. Just systematic reading of the plan, one line at a time.\n💬 The lesson from that day #My colleague, after seeing the difference, told me: \u0026ldquo;So all it took was an ANALYZE?\u0026rdquo;\nYes and no. In that specific case, yes. But the point isn\u0026rsquo;t the command. The point is knowing how to read the plan to understand where to look. EXPLAIN ANALYZE gives you the data. It\u0026rsquo;s up to you to interpret it.\nI\u0026rsquo;ve seen DBAs with years of experience run EXPLAIN ANALYZE, look at the total time at the bottom, and say \u0026ldquo;the query is slow.\u0026rdquo; It\u0026rsquo;s like checking a patient\u0026rsquo;s temperature and saying \u0026ldquo;they have a fever.\u0026rdquo; Sure, but what\u0026rsquo;s causing it?\nThe execution plan tells you what\u0026rsquo;s causing it. Each node is an organ. Estimated rows versus actual rows are the lab results. Buffers are the X-rays. And ANALYZE is the antibiotic that solves 70% of cases.\nBut for that remaining 30%, you need to read. Line by line. 
Node by node. There\u0026rsquo;s no shortcut.\nGlossary #Execution Plan — the sequence of operations (scan, join, sort) the database chooses to resolve a SQL query. Viewed with EXPLAIN and EXPLAIN ANALYZE.\nNested Loop — a join strategy that for each row in the outer table looks for matches in the inner table. Ideal for few rows, disastrous on large volumes when mistakenly chosen by the optimizer.\nHash Join — a join strategy that builds a hash table from the smaller table and then scans the larger one looking for matches with O(1) lookups. Efficient on large volumes without indexes.\nANALYZE — PostgreSQL command that collects statistics on data distribution in tables, used by the optimizer to estimate cardinality and choose the execution plan.\ndefault_statistics_target — PostgreSQL parameter that defines how many samples to collect per column during ANALYZE. The default is 100; for columns with skewed distribution it should be raised to 500-1000.\n","date":"28 October 2025","permalink":"https://ivanluminaria.com/en/posts/postgresql/explain-analyze-postgresql/","section":"Database Strategy","summary":"\u003cp\u003eThe other day a colleague sends me a screenshot on Teams. A query running on a 2-million-row table, 45 seconds execution time. He writes:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u0026ldquo;I ran EXPLAIN ANALYZE, but I can\u0026rsquo;t figure out what\u0026rsquo;s wrong. The plan looks fine.\u0026rdquo;\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eSpoiler: the plan was anything but fine. 
The optimizer had chosen a \u003cspan class=\"glossary-tip\" tabindex=\"0\" data-glossary-desc=\"Nested Loop Join — the join strategy that scans the inner table for each row of the outer table, ideal for small datasets with an index.\" data-glossary-url=\"/en/glossary/nested-loop/\" data-glossary-more=\"Read more →\"\u003enested loop\u003c/span\u003e\n join where a \u003cspan class=\"glossary-tip\" tabindex=\"0\" data-glossary-desc=\"Hash Join — a join strategy optimized for large data volumes, based on a hash table built in memory.\" data-glossary-url=\"/en/glossary/hash-join/\" data-glossary-more=\"Read more →\"\u003ehash join\u003c/span\u003e\n was needed, and the reason was trivial — stale statistics. But to get there I had to read the plan line by line, and that\u0026rsquo;s when I realized that most DBAs I know use EXPLAIN ANALYZE as a binary oracle: if the time is high, the query is slow. End of analysis.\u003c/p\u003e","title":"EXPLAIN ANALYZE is not enough: how to actually read a PostgreSQL execution plan"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/optimizer/","section":"Tags","summary":"","title":"Optimizer"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/fact-table/","section":"Tags","summary":"","title":"Fact-Table"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/grain/","section":"Tags","summary":"","title":"Grain"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/granularity/","section":"Tags","summary":"","title":"Granularity"},{"content":"The meeting had started well. The sales director of an industrial distribution company — around sixty million in revenue, three thousand active customers, a catalog of twelve thousand SKUs — had opened the new data warehouse presentation with a smile. The numbers matched, the dashboards were polished, the monthly totals by agent and territory reconciled with accounting.\nThen someone asked the wrong question. 
Or rather, the right one.\n\u0026ldquo;Can I see what customer Bianchi purchased in March, line by line, product by product?\u0026rdquo;\nSilence.\nThe BI manager looked at me. I looked at the screen. The screen showed a fact table with one row per customer per month: total invoiced amount, total quantity, invoice count. No detail. No invoice line. No product.\nThat fact table answered one question only: how much did each customer invoice in a given month? Everything else — by product, by product family, by individual invoice — was out of reach.\n🔍 The grain: the decision that determines everything #In dimensional modeling, the grain of the fact table is the first decision you make. Not the second, not one among many: the first. Kimball repeats it in every chapter, and he\u0026rsquo;s right.\nThe grain answers the question: what does a single row in the fact table represent?\nIn the project I described, the original designer had chosen a monthly-customer grain: one row = one customer in one month. The reasons seemed sound: the source system exported a monthly summary, loading was fast, tables were small, queries were simple.\nBut the grain determines which questions the data warehouse can answer. If the grain is a monthly summary per customer, you can\u0026rsquo;t go below that level. You can\u0026rsquo;t drill down by product. You can\u0026rsquo;t tell whether customer Bianchi bought the same item ten times or ten different items. You can\u0026rsquo;t compare margins by product family.\nYou have a total. 
Period.\n📊 The problem in numbers #The original fact table had this structure:\nCREATE TABLE fact_monthly_revenue ( sk_customer INT NOT NULL, sk_time INT NOT NULL, -- month (YYYYMM) sk_agent INT NOT NULL, sk_territory INT NOT NULL, total_amount DECIMAL(15,2), total_quantity INT, num_invoices INT, num_lines INT, FOREIGN KEY (sk_customer) REFERENCES dim_customer(sk_customer), FOREIGN KEY (sk_time) REFERENCES dim_time(sk_time) ); Rows per year: about 180,000 (3,000 customers × 12 months, with some customer-months split across multiple agents and territories). Small, fast, easy to load. The ETL ran in under five minutes.\nThe problem? The additive measures were already aggregated. total_amount was the sum of all invoice lines for the month. No way to trace back to the composition. Like having a receipt total without knowing what you bought.\n🏗️ The restructuring: going down to the invoice line #There was only one solution: change the grain. Bring the fact table down to the lowest level available in the source system — the individual invoice line.\nCREATE TABLE fact_revenue_line ( sk_invoice_line INT PRIMARY KEY, sk_invoice INT NOT NULL, sk_customer INT NOT NULL, sk_product INT NOT NULL, sk_time INT NOT NULL, -- day (YYYYMMDD) sk_agent INT NOT NULL, sk_territory INT NOT NULL, sk_family INT NOT NULL, quantity INT, unit_price DECIMAL(12,4), line_amount DECIMAL(15,2), discount_pct DECIMAL(5,2), net_amount DECIMAL(15,2), product_cost DECIMAL(15,2), margin DECIMAL(15,2), FOREIGN KEY (sk_customer) REFERENCES dim_customer(sk_customer), FOREIGN KEY (sk_product) REFERENCES dim_product(sk_product), FOREIGN KEY (sk_time) REFERENCES dim_time(sk_time) ); Rows per year: about 2.4 million (3,000 customers × ~800 lines/year on average). An order of magnitude more. 
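The jump in volume is easy to sanity-check with quick arithmetic. Python is used here just as a calculator; all figures come from the article itself.

```python
# Row counts at the two grains, using the article's figures.
customers = 3_000
lines_per_customer_per_year = 800   # the article's stated average

monthly_grain_rows = 180_000        # the article's count at the monthly grain
line_grain_rows = customers * lines_per_customer_per_year

growth = line_grain_rows / monthly_grain_rows
print(line_grain_rows, round(growth, 1))
```

About 13 times more rows: "an order of magnitude", exactly as stated, and the price of being able to drill down at all.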
But every row carried the full detail: which product, which invoice, which price, which discount, which margin.\n⚡ The ETL impact #The grain change had a cascading effect on the ETL that nobody had anticipated — or rather, that whoever chose the aggregated grain had avoided facing.\nNew dimensions required:\nDimension Cardinality Notes dim_product ~12,000 Didn\u0026rsquo;t exist before: wasn\u0026rsquo;t needed dim_family ~180 3-level product hierarchy dim_invoice ~45,000/yr Invoice header with master data New loading window:\nPhase Before After Extraction 40 sec 3 min Transformation 1 min 8 min Fact loading 30 sec 4 min Total ~2 min ~15 min Fifteen minutes versus two. An acceptable price for a data warehouse that now answered real questions.\n🔬 Queries that were previously impossible #With the new grain, the queries the business wanted became trivial:\nCustomer purchases by product:\nSELECT c.company_name, p.product_code, p.description, SUM(f.quantity) AS units, SUM(f.net_amount) AS net_revenue, SUM(f.margin) AS total_margin FROM fact_revenue_line f JOIN dim_customer c ON f.sk_customer = c.sk_customer JOIN dim_product p ON f.sk_product = p.sk_product JOIN dim_time t ON f.sk_time = t.sk_time WHERE c.company_name = \u0026#39;Bianchi Srl\u0026#39; AND t.year = 2024 AND t.month = 3 GROUP BY c.company_name, p.product_code, p.description ORDER BY net_revenue DESC; Top 10 products by margin in a quarter:\nSELECT p.product_code, p.description, fm.family_desc, SUM(f.net_amount) AS revenue, SUM(f.margin) AS margin, ROUND(SUM(f.margin) / NULLIF(SUM(f.net_amount), 0) * 100, 1) AS margin_pct FROM fact_revenue_line f JOIN dim_product p ON f.sk_product = p.sk_product JOIN dim_family fm ON f.sk_family = fm.sk_family JOIN dim_time t ON f.sk_time = t.sk_time WHERE t.year = 2024 AND t.quarter = 1 GROUP BY p.product_code, p.description, fm.family_desc ORDER BY margin DESC LIMIT 10; Agent comparison: average revenue per invoice line:\nSELECT a.agent_name, COUNT(*) AS num_lines, 
SUM(f.net_amount) AS total_revenue, ROUND(AVG(f.net_amount), 2) AS avg_per_line FROM fact_revenue_line f JOIN dim_agent a ON f.sk_agent = a.sk_agent JOIN dim_time t ON f.sk_time = t.sk_time WHERE t.year = 2024 GROUP BY a.agent_name ORDER BY total_revenue DESC; None of these queries was possible with the monthly-customer grain. None. It wasn\u0026rsquo;t a matter of tuning or indexing — it was a structural problem, written in the model\u0026rsquo;s DNA.\n📋 The Kimball rule we had ignored #Ralph Kimball puts it plainly: \u0026ldquo;always model at the finest level of detail available in the source system.\u0026rdquo;\nThis isn\u0026rsquo;t a suggestion. It\u0026rsquo;s not one option among many. It\u0026rsquo;s the founding principle of dimensional modeling. And the reason is simple: you can always aggregate from detail to total, but you can never disaggregate a total back into its detail.\nAggregation is an irreversible operation. Like mixing colors: from red and yellow you can get orange, but from orange you can never go back to the original colors.\nIn our project, the choice of an aggregated grain was driven by design laziness, not by a technical constraint. The source system had line-level detail — nobody had wanted to deal with the complexity of modeling it, managing the additional dimensions, extending the ETL window.\nThe result? A data warehouse that had to be rebuilt from scratch six months after go-live.\n🎯 When an aggregated grain makes sense #A fine grain isn\u0026rsquo;t always the only answer. There are legitimate cases for aggregated fact tables:\nAggregate fact tables alongside the detail table, to speed up the most frequent queries Periodic snapshots where the business genuinely thinks in periods (monthly account balance, end-of-week inventory) Source constraints when the upstream system doesn\u0026rsquo;t expose detail and there\u0026rsquo;s no way to get it But the rule is: start from detail, then aggregate. Never the other way around. 
Aggregate fact tables are an optimization, not a substitute for fine grain.\nIn our case, after the restructuring, we also created a materialized view with the monthly summary per customer — the same structure as before — for executive dashboards that didn\u0026rsquo;t need the detail. The best of both worlds, without sacrificing anything.\nWhat I learned #That project taught me something I carry into every engagement since: the first half-hour of data warehouse design, the one where you decide the grain, is worth more than all the optimizations that come later. A flawless ETL, perfectly tuned indexes, powerful hardware — none of it compensates for the wrong grain.\nIf your fact table can\u0026rsquo;t answer the business\u0026rsquo;s questions, it\u0026rsquo;s not the queries\u0026rsquo; fault. It\u0026rsquo;s the model\u0026rsquo;s fault. And the model is decided at the grain.\nGlossary #Grain — The level of detail of a fact table in a data warehouse. Determines what each row represents and which questions the model can answer. It\u0026rsquo;s the first decision in dimensional design.\nFact table — The central table in a star schema containing numeric measures (amounts, quantities, margins) and foreign keys to dimensions. Its grain determines the level of analysis possible.\nAdditive Measure — A numeric measure that can be summed across all dimensions (e.g., amount, quantity). Once aggregated to a higher level, the original detail is irreversibly lost.\nDrill-down — Navigation in reports from an aggregated level to detail, along a hierarchy. Only possible if the fact table contains data at a sufficient grain level.\nStar Schema — A data model with a central fact table and linked dimension tables. The most common structure in data warehouses for simple, fast analytical queries.\nETL — Extract, Transform, Load: the process of extracting, transforming, and loading data into a data warehouse. 
A grain change directly impacts ETL duration and complexity.\n","date":"21 October 2025","permalink":"https://ivanluminaria.com/en/posts/data-warehouse/fatto-grana-sbagliata/","section":"Database Strategy","summary":"\u003cp\u003eThe meeting had started well. The sales director of an industrial distribution company — around sixty million in revenue, three thousand active customers, a catalog of twelve thousand SKUs — had opened the new data warehouse presentation with a smile. The numbers matched, the dashboards were polished, the monthly totals by agent and territory reconciled with accounting.\u003c/p\u003e\n\u003cp\u003eThen someone asked the wrong question. Or rather, the right one.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026ldquo;Can I see what customer Bianchi purchased in March, line by line, product by product?\u0026rdquo;\u003c/em\u003e\u003c/p\u003e","title":"Wrong grain: when the fact table can't answer the right questions"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/binary-log/","section":"Tags","summary":"","title":"Binary-Log"},{"content":"The alert came on a Monday morning, wedged between three meetings and a coffee that was still hot. \u0026ldquo;Filesystem /mysql at 85% on the primary node.\u0026rdquo; On another node it was 66%, on the third 25%. In a cluster, when the numbers don\u0026rsquo;t match across nodes, there\u0026rsquo;s always something going on underneath.\nThe first question that comes to mind is \u0026ldquo;how much space do we need?\u0026rdquo; But that\u0026rsquo;s the wrong question. The right one is: \u0026ldquo;why is it filling up?\u0026rdquo;\nThe cause: binary logs on the wrong volume #Checking was quick:\nSHOW VARIABLES LIKE \u0026#39;log_bin\u0026#39;; Result: ON. Binary logs were active — as expected in a cluster. But the path was the issue:\nSHOW VARIABLES LIKE \u0026#39;log_bin_basename\u0026#39;; /mysql/bin_log/binlog The binlogs were sitting on the same volume as the data: /mysql. 
A roughly 3 TB volume that on one node was already at 85%.\nI also checked retention:\nSHOW VARIABLES LIKE \u0026#39;binlog_expire_logs_seconds\u0026#39;; 2592000 Thirty days. Then I wanted to understand how much this configuration actually weighed. I checked the size of individual binlog files and the write rate: each file was roughly 1 GB, and the server was generating one every two hours. Twelve files a day, times thirty days of retention: approximately 360 GB of binary logs on the main volume. On a 3 TB volume shared with the data, binlogs alone were eating over 10% of the space. And those files don\u0026rsquo;t just sit on the primary — in Group Replication each node writes its own local binlogs for synchronization, so the problem was multiplied across all three nodes.\nThe picture was clear: binary logs were eating up the main filesystem. Not a bug, not a runaway table. Just an architectural choice made at installation time and never revisited.\nWhat kind of cluster is this, exactly? #Before touching anything on a MySQL server — before even thinking about moving a file — you need to know what you\u0026rsquo;re dealing with. \u0026ldquo;It\u0026rsquo;s a cluster\u0026rdquo; isn\u0026rsquo;t enough. MySQL has at least four different ways of doing high availability, and each one has its own rules.\nI started with classic replication:\nSHOW SLAVE STATUS\\G Empty set on both nodes I checked. No traditional replication running.\nThen I tried SHOW REPLICA STATUS — but on MySQL 8.0.20 that command doesn\u0026rsquo;t exist yet. It was introduced in 8.0.22. A detail that online documentation often forgets to mention, leaving you chasing a syntax error that isn\u0026rsquo;t one.\nNext step — Group Replication:\nSELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members; And there it was:\nMEMBER_HOST MEMBER_STATE MEMBER_ROLE dbcluster01 ONLINE SECONDARY dbcluster02 ONLINE SECONDARY dbcluster03 ONLINE PRIMARY Three nodes. All ONLINE. 
One primary, two secondaries. Group Replication in single-primary mode.\nFinal confirmation from the plugins:\nSHOW PLUGINS; In the list: group_replication | ACTIVE | GROUP REPLICATION | group_replication.so. And from the configuration:\nSHOW VARIABLES LIKE \u0026#39;group_replication_single_primary_mode\u0026#39;; ON Now I knew exactly what I was dealing with. Not classic replication, not Galera, not NDB Cluster. A MySQL Group Replication single-primary with three nodes, GTID enabled, ROW-based binlog format. The full picture.\nThe temptation is always to skip this phase. \u0026ldquo;I know it\u0026rsquo;s a cluster, let\u0026rsquo;s move.\u0026rdquo; But skipping diagnosis on a cluster is like operating without a CT scan: you might get lucky, or you might cause a disaster.\nThe solution: a dedicated volume for binary logs #The strategy was straightforward: binlogs need their own volume. Not on the same filesystem as the data, not on an improvised symlink, not on a shared directory. A dedicated volume, mounted at the same path on all three nodes.\nI asked the sysadmins to provision a new 600 GB volume with mount point /mysql/binary_logs on each of the three nodes.\nWhen the volume was ready, I verified on all three:\ndf -h /mysql/binary_logs Node /mysql /mysql/binary_logs dbcluster03 (PRIMARY) 85% 1% dbcluster02 (SECONDARY) 66% 1% dbcluster01 (SECONDARY) 25% 1% Fresh, dedicated space. Each volume on a local disk belonging to the respective VM — three disks, three volumes, same mountpoint across all three nodes. The sysadmins had done a clean job.\nThe checks before touching MySQL #Before stopping the first node, I ran three checks that I consider mandatory.\nDirectory permissions. MySQL won\u0026rsquo;t start if it can\u0026rsquo;t write to the binlog directory. 
Sounds obvious, but it\u0026rsquo;s one of the most common reasons for \u0026ldquo;why won\u0026rsquo;t it restart after the config change.\u0026rdquo;\nls -ld /mysql/binary_logs On all three nodes the permissions were 755. It works, but it\u0026rsquo;s not great security-wise — binlogs can contain sensitive data. I changed them to 750:\nchmod 750 /mysql/binary_logs Result: drwxr-x--- mysql mysql. Only the mysql user can read and write.\nReal write test. Before letting MySQL write there, I verified the filesystem was responding:\ntouch /mysql/binary_logs/testfile ls -l /mysql/binary_logs/testfile rm -f /mysql/binary_logs/testfile If the touch fails, the problem is storage or permissions — and better to find out now than after a MySQL restart.\nCluster state. The last check before proceeding:\nSELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members; Three nodes ONLINE. Quorum intact. Ready to go.\nThe strategy: one node at a time, primary last #In a three-node Group Replication, quorum is two. If you stop one node, the other two keep the group alive. If you stop two — you\u0026rsquo;ve lost the cluster.\nThe rule is simple: one node at a time, waiting for the previous one to rejoin the group before touching the next. And the primary goes last.\nWhy? Because when you stop the primary, something important happens: the cluster triggers an automatic election and one of the secondaries becomes the new primary. During those seconds — just a few, if everything is healthy — active connections may be dropped, in-flight transactions may fail. It\u0026rsquo;s a brief disruption, but it\u0026rsquo;s a disruption. It needs to be communicated.\nThe order I followed:\ndbcluster01 (SECONDARY) dbcluster02 (SECONDARY) dbcluster03 (PRIMARY) The procedure, node by node #On each node the sequence is identical:\nA. Verify the node\u0026rsquo;s role. 
Before stopping it, confirm it is what you think:\nSELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members; B. Stop MySQL:\nsystemctl stop mysqld C. Modify the configuration. In my.cnf, change the log_bin parameter:\nFrom:\nlog_bin=/mysql/bin_log/binlog To:\nlog_bin=/mysql/binary_logs/mysql-bin One line. One single change. Don\u0026rsquo;t touch the Group Replication parameters, don\u0026rsquo;t change the server_id, don\u0026rsquo;t reinvent the engine while you\u0026rsquo;re changing a tyre.\nD. Start MySQL:\nsystemctl start mysqld E. Verify. Three things to check:\nThe new path:\nSHOW VARIABLES LIKE \u0026#39;log_bin_basename\u0026#39;; Must return /mysql/binary_logs/mysql-bin.\nCluster membership:\nSELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members; The node must come back ONLINE.\nNew binlogs on the new path:\nls -lh /mysql/binary_logs/ New mysql-bin.000001 files should appear.\nOnly when the node is ONLINE and the cluster shows three active nodes again do you move to the next one. Not before.\nFor the primary — dbcluster03 — the procedure is identical, but before stopping it I verified that both secondaries were ONLINE and already migrated. At the moment of the stop, the cluster triggered the election. One of the secondaries became primary. Brief disruption, as expected.\nWhat not to do #From my experience, these are the most common traps in this kind of intervention:\nDon\u0026rsquo;t copy old binlogs to the new path. In Group Replication there\u0026rsquo;s no need for binary archaeology. New binlogs will be created in the new directory after the restart. The old ones are only needed if you require point-in-time recovery — and in that case you already know where to find them.\nDon\u0026rsquo;t touch two nodes at the same time. With three nodes, quorum is sacred. One node at a time, no exceptions. 
If you stop two together, you\u0026rsquo;re playing blindfolded Jenga.\nDon\u0026rsquo;t start with the primary. Always secondaries first, primary last. Doing it the other way round is the elegant way to invite chaos to dinner.\nDon\u0026rsquo;t delete old binlogs immediately. After the change, the old path /mysql/bin_log/ won\u0026rsquo;t be used for new files. But don\u0026rsquo;t rush to rm -rf /mysql/bin_log/*. Wait. Verify that the cluster is stable, that new binlogs are being written to the new mount, that there are no errors in the MySQL log. Only after a few days of observation should you think about cleanup.\nDon\u0026rsquo;t just trust the fact that \u0026ldquo;MySQL started\u0026rdquo;. MySQL can start but fail to rejoin the group. You need to verify three things: log_bin_basename points to the new path, the node is ONLINE in replication_group_members, and binlog files are actually being written in the new directory.\nWhat this operation really teaches #A filesystem at 92% isn\u0026rsquo;t an emergency — it\u0026rsquo;s a signal. The real problem wasn\u0026rsquo;t disk space; it was an architectural choice made at installation time and never revisited: binlogs and data on the same volume.\nSeparating binary logs onto a dedicated volume isn\u0026rsquo;t just a fix. It\u0026rsquo;s infrastructure hardening. It\u0026rsquo;s the difference between a system that \u0026ldquo;works\u0026rdquo; and one that\u0026rsquo;s designed to keep working as things grow.\nAnd the most important part of the entire intervention wasn\u0026rsquo;t the my.cnf change — that\u0026rsquo;s one line. The important part was the diagnosis: understanding what kind of cluster I was facing, checking the state of every node, preparing the storage, testing permissions, planning the execution order. All before touching a single parameter.\nA senior DBA and a junior DBA both know the systemctl stop mysqld command. 
The difference is everything that happens before it.\nGlossary #Group Replication — MySQL\u0026rsquo;s native mechanism for synchronous multi-node replication with automatic failover and quorum management. Supports single-primary and multi-primary modes.\nBinary log — MySQL\u0026rsquo;s sequential binary record that tracks all data modifications (INSERT, UPDATE, DELETE, DDL), used for replication and point-in-time recovery.\nGTID — Global Transaction Identifier — unique identifier assigned to every transaction in MySQL, simplifying replication management and transaction tracking across cluster nodes.\nQuorum — Minimum number of nodes that must be active and communicating for a cluster to continue operating. In a 3-node cluster, quorum is 2.\nSingle-primary — Group Replication mode where only one node accepts writes while the others are read-only with automatic failover.\n","date":"14 October 2025","permalink":"https://ivanluminaria.com/en/posts/mysql/mysql-group-replication-binlog-migration/","section":"Database Strategy","summary":"\u003cp\u003eThe alert came on a Monday morning, wedged between three meetings and a coffee that was still hot. \u0026ldquo;Filesystem /mysql at 85% on the primary node.\u0026rdquo; On another node it was 66%, on the third 25%. In a cluster, when the numbers don\u0026rsquo;t match across nodes, there\u0026rsquo;s always something going on underneath.\u003c/p\u003e\n\u003cp\u003eThe first question that comes to mind is \u0026ldquo;how much space do we need?\u0026rdquo; But that\u0026rsquo;s the wrong question. 
The right one is: \u0026ldquo;why is it filling up?\u0026rdquo;\u003c/p\u003e","title":"Full disk on a MySQL cluster: binary logs, Group Replication, and a migration that leaves no room for mistakes"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/group-replication/","section":"Tags","summary":"","title":"Group-Replication"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/tags/innodb-cluster/","section":"Tags","summary":"","title":"Innodb-Cluster"},{"content":"An additive measure is a numeric value in a fact table that can be legitimately summed across any dimension: by customer, by product, by period, by territory.\nHow it works #Measures in fact tables fall into three categories:\nAdditive: can be summed across all dimensions (e.g., sales amount, quantity, cost). The most common and most useful Semi-additive: can be summed across some dimensions but not across time (e.g., account balance: summable by branch, not by month) Non-additive: cannot be summed at all (e.g., percentages, ratios, pre-calculated averages) What it\u0026rsquo;s for #Additive measures are the heart of every fact table because they enable the aggregations that the business requires: totals by period, by region, by product. The key rule: always store atomic values (the detail), never aggregates. From a line-level amount you can derive the monthly total; from a monthly total you cannot reconstruct the individual lines.\nWhen to use it #When designing a fact table, every measure should be classified as additive, semi-additive, or non-additive. This determines which aggregations are valid in reports and which would produce incorrect results. 
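The three classes above can be sketched in a few lines of Python. The numbers are toy values, purely illustrative.

```python
# Toy measures for one quarter (all values invented).
monthly_sales   = {"Jan": 100_000, "Feb": 120_000, "Mar": 90_000}
monthly_balance = {"Jan": 500_000, "Feb": 510_000, "Mar": 495_000}

# Additive: summing sales across time gives a meaningful Q1 total.
q1_sales = sum(monthly_sales.values())

# Semi-additive: summing balances across time is arithmetic without
# meaning; the valid time aggregations are last-value or average.
q1_closing_balance = monthly_balance["Mar"]
q1_avg_balance = sum(monthly_balance.values()) / len(monthly_balance)

# Non-additive: a percentage must be recomputed from additive parts,
# never summed or averaged directly.
monthly_margin = {"Jan": 20_000, "Feb": 30_000, "Mar": 18_000}
q1_margin_pct = sum(monthly_margin.values()) / q1_sales * 100

print(q1_sales, q1_closing_balance, round(q1_margin_pct, 1))
```

Note that summing monthly_balance to 1,505,000 would execute without complaint in SQL too, which is why the classification has to live in the model and the report logic, not in the engine.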
A common mistake is treating a semi-additive measure (like a balance) as if it were additive — summing monthly balances to get a \u0026ldquo;total\u0026rdquo; that has no business meaning.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/additive-measure/","section":"Glossary","summary":"\u003cp\u003eAn \u003cstrong\u003eadditive measure\u003c/strong\u003e is a numeric value in a fact table that can be legitimately summed across any dimension: by customer, by product, by period, by territory.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eMeasures in fact tables fall into three categories:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eAdditive\u003c/strong\u003e: can be summed across all dimensions (e.g., sales amount, quantity, cost). 
The most common and most useful\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSemi-additive\u003c/strong\u003e: can be summed across some dimensions but not across time (e.g., account balance: summable by branch, not by month)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eNon-additive\u003c/strong\u003e: cannot be summed at all (e.g., percentages, ratios, pre-calculated averages)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"what-its-for\" class=\"relative group\"\u003eWhat it\u0026rsquo;s for \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#what-its-for\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eAdditive measures are the heart of every fact table because they enable the aggregations that the business requires: totals by period, by region, by product. The key rule: always store atomic values (the detail), never aggregates. From a line-level amount you can derive the monthly total; from a monthly total you cannot reconstruct the individual lines.\u003c/p\u003e","title":"Additive Measure"},{"content":"The AI Manager is the professional role that governs the introduction and use of artificial intelligence within a project or organization. They are not the one who uses AI — they are the one who decides where, how, and with what precautions to integrate it into existing architectures.\nHow it works #The AI Manager answers questions no model can answer: where does AI generate real value and where does it generate only enthusiasm? How much does it cost to maintain, not just implement? What happens when the model is wrong? How does it integrate with mission-critical architectures without compromising stability?\nWhat it\u0026rsquo;s for #It separates signal from noise. 
In a market where every vendor promises triple-digit ROI, the AI Manager identifies the three areas where AI generates concrete value: analysis acceleration, decision noise reduction, and automated knowledge transfer. Everything else is shiny demos.\nWhy it matters #Without someone governing AI, organizations suffer it rather than leverage it. AI gets integrated without verifying training data provenance, without a fallback plan, without governance. In regulated environments (banking, public administration, healthcare) this is a risk that can cost far more than the AI itself.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/ai-manager/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eAI Manager\u003c/strong\u003e is the professional role that governs the introduction and use of artificial intelligence within a project or organization. They are not the one who uses AI — they are the one who decides where, how, and with what precautions to integrate it into existing architectures.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe AI Manager answers questions no model can answer: where does AI generate real value and where does it generate only enthusiasm? How much does it cost to maintain, not just implement? What happens when the model is wrong? 
How does it integrate with mission-critical architectures without compromising stability?\u003c/p\u003e","title":"AI Manager"},{"content":"ANALYZE is the PostgreSQL command that collects statistics about data distribution in tables and stores them in the pg_statistic catalog (readable through the pg_stats view). The optimizer uses these statistics to estimate cardinality — how many rows each operation will return — and choose the most efficient execution plan.\nWhat it collects #The statistics collected by ANALYZE include:\nMost common values: the most frequent values for each column and their percentage Distribution histograms: how the remaining values are distributed Number of distinct values: how many unique values each column has NULL percentage: how many rows have NULL for each column The quality of these statistics depends on the number of samples collected, controlled by the default_statistics_target parameter.\nWhy it matters #Without up-to-date statistics, the optimizer is forced to guess. Wrong estimates lead to disastrous execution plans — such as choosing a nested loop on millions of rows thinking there are only hundreds, or ignoring a perfectly suitable index.\nWhen to run it #PostgreSQL runs ANALYZE automatically through autovacuum, but the default threshold (50 rows + 10% of live rows) can be too high for rapidly growing tables. 
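A minimal sketch of the manual workflow (orders and status are hypothetical names):

```sql
-- Compare the planner estimate (rows=...) with the actual rows returned,
-- then refresh statistics for the affected table only.
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'open';
ANALYZE orders;
-- If estimates stay off for a skewed column, raise its sample size:
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
```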
Situations where a manual ANALYZE is needed:\nAfter bulk imports or bulk loads After significant changes in data distribution When EXPLAIN ANALYZE shows cardinality estimates far from actual rows After modifying a column\u0026rsquo;s default_statistics_target ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/postgresql-analyze/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eANALYZE\u003c/strong\u003e is the PostgreSQL command that collects statistics about data distribution in tables and stores them in the \u003ccode\u003epg_statistic\u003c/code\u003e catalog (readable through the \u003ccode\u003epg_stats\u003c/code\u003e view). The optimizer uses these statistics to estimate cardinality — how many rows each operation will return — and choose the most efficient execution plan.\u003c/p\u003e\n\u003ch2 id=\"what-it-collects\" class=\"relative group\"\u003eWhat it collects \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#what-it-collects\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe statistics collected by ANALYZE include:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eMost common values\u003c/strong\u003e: the most frequent values for each column and their percentage\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eDistribution histograms\u003c/strong\u003e: how the remaining values are distributed\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eNumber of distinct values\u003c/strong\u003e: how many unique values each column has\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eNULL percentage\u003c/strong\u003e: how many rows have NULL for each column\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe quality of these statistics depends on the number of samples 
collected, controlled by the \u003ccode\u003edefault_statistics_target\u003c/code\u003e parameter.\u003c/p\u003e","title":"ANALYZE"},{"content":"The Anonymous User is a MySQL/MariaDB account with an empty username (''@'localhost') that is automatically created during installation. It has no name and often no password.\nHow it works #When a user connects, MySQL looks for the most specific match in the mysql.user table. The anonymous user ''@'localhost' is more specific than 'mario'@'%' for a connection from localhost, because 'localhost' beats '%' in the specificity hierarchy. Consequently, Mario connecting locally gets authenticated as the anonymous user and loses all his privileges.\nWhat it\u0026rsquo;s for #The anonymous user was intended for development installations where connections without credentials were desired. In production it serves no purpose and represents a security risk: it can capture connections intended for other users and grant unauthorised access.\nWhen to use it #Never in production. The first operation on any production MySQL/MariaDB installation is to check for and remove anonymous users with SELECT user, host FROM mysql.user WHERE user = '' followed by DROP USER ''@'localhost'.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/anonymous-user/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eAnonymous User\u003c/strong\u003e is a MySQL/MariaDB account with an empty username (\u003ccode\u003e''@'localhost'\u003c/code\u003e) that is automatically created during installation. 
It has no name and often no password.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a user connects, MySQL looks for the most specific match in the \u003ccode\u003emysql.user\u003c/code\u003e table. The anonymous user \u003ccode\u003e''@'localhost'\u003c/code\u003e is more specific than \u003ccode\u003e'mario'@'%'\u003c/code\u003e for a connection from localhost, because \u003ccode\u003e'localhost'\u003c/code\u003e beats \u003ccode\u003e'%'\u003c/code\u003e in the specificity hierarchy. Consequently, Mario connecting locally gets authenticated as the anonymous user and loses all his privileges.\u003c/p\u003e","title":"Anonymous User"},{"content":"ASH (Active Session History) is an Oracle Database component that samples the state of every active session once per second and stores the data in an in-memory circular buffer (the V$ACTIVE_SESSION_HISTORY view).\nHow it works #Every second Oracle records for each active session:\nCurrently executing SQL (SQL_ID) Current wait event Calling program and module Execution plan in use (SQL_PLAN_HASH_VALUE) Older data is automatically flushed to AWR tables (DBA_HIST_ACTIVE_SESS_HISTORY) and retained for the configured period.\nWhat it\u0026rsquo;s for #ASH is the DBA\u0026rsquo;s microscope: where AWR shows averages over hourly intervals, ASH lets you reconstruct what a single session was doing at a precise moment. 
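A hedged starting query (the columns are standard in V$ACTIVE_SESSION_HISTORY; the 5-minute window is an example to adjust to the incident):

```sql
-- Which SQL_ID and wait event dominate the last 5 minutes of samples?
SELECT sql_id, event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSDATE - 5/1440
GROUP  BY sql_id, event
ORDER  BY samples DESC;
```

Because ASH samples once per second, the sample count is a direct proxy for time spent: a SQL_ID with 200 samples was active for roughly 200 seconds in that window.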
It is the ideal tool for:\nIdentifying who is running a problematic SQL Understanding exactly when a problem started (to the second) Correlating sessions, programs and wait events in real time When to use it #Use it when the AWR report has already identified a dominant SQL or wait event and you need detail: which session, which program, at what exact time. The rule of thumb: AWR to understand what changed, ASH to understand why.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/ash/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eASH\u003c/strong\u003e (Active Session History) is an Oracle Database component that samples the state of every active session once per second and stores the data in an in-memory circular buffer (the \u003ccode\u003eV$ACTIVE_SESSION_HISTORY\u003c/code\u003e view).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEvery second Oracle records for each active session:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eCurrently executing SQL (\u003ccode\u003eSQL_ID\u003c/code\u003e)\u003c/li\u003e\n\u003cli\u003eCurrent wait event\u003c/li\u003e\n\u003cli\u003eCalling program and module\u003c/li\u003e\n\u003cli\u003eExecution plan in use (\u003ccode\u003eSQL_PLAN_HASH_VALUE\u003c/code\u003e)\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eOlder data is automatically flushed to AWR tables (\u003ccode\u003eDBA_HIST_ACTIVE_SESS_HISTORY\u003c/code\u003e) and retained for the configured period.\u003c/p\u003e","title":"ASH"},{"content":"An Authentication Plugin is the module that MySQL or MariaDB uses to verify a 
user\u0026rsquo;s credentials at connection time. Every user in the system is associated with a specific plugin that determines how the password is hashed, transmitted and verified.\nHow it works #The main plugins are: mysql_native_password (default in MySQL 5.7 and MariaDB), which uses a double SHA1 hash; caching_sha2_password (default in MySQL 8.0+), which uses SHA-256 with caching to improve security and performance. When a client connects, it must support the plugin of the user it\u0026rsquo;s trying to authenticate to.\nWhat it\u0026rsquo;s for #Knowledge of authentication plugins is essential during migrations between versions or between MySQL and MariaDB. A client that only supports mysql_native_password cannot connect to a user with caching_sha2_password — and the resulting error is often cryptic and hard to diagnose.\nWhen to use it #The plugin is specified at user creation time (CREATE USER ... IDENTIFIED WITH \u0026lt;plugin\u0026gt; BY 'password') or can be checked and changed with ALTER USER. When writing provisioning scripts that must work across different MySQL/MariaDB versions, it\u0026rsquo;s important to explicitly specify the plugin.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/authentication-plugin/","section":"Glossary","summary":"\u003cp\u003eAn \u003cstrong\u003eAuthentication Plugin\u003c/strong\u003e is the module that MySQL or MariaDB uses to verify a user\u0026rsquo;s credentials at connection time. 
Every user in the system is associated with a specific plugin that determines how the password is hashed, transmitted and verified.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe main plugins are: \u003ccode\u003emysql_native_password\u003c/code\u003e (default in MySQL 5.7 and MariaDB), which uses a double SHA1 hash; \u003ccode\u003ecaching_sha2_password\u003c/code\u003e (default in MySQL 8.0+), which uses SHA-256 with caching to improve security and performance. When a client connects, it must support the plugin of the user it\u0026rsquo;s trying to authenticate to.\u003c/p\u003e","title":"Authentication Plugin"},{"content":"Autovacuum is a PostgreSQL daemon that automatically runs VACUUM and ANALYZE on tables when the number of dead tuples exceeds a threshold calculated as: threshold + scale_factor × n_live_tup. With defaults (threshold=50, scale_factor=0.2), on a table with 10 million rows it triggers after 2 million dead tuples.\nHow it works #The daemon periodically checks pg_stat_user_tables and launches a worker for each table exceeding the threshold. The maximum number of simultaneous workers is controlled by autovacuum_max_workers (default 3). The autovacuum_vacuum_cost_delay parameter controls how much vacuum throttles itself to avoid overloading I/O.\nWhat it\u0026rsquo;s for #It is the silent custodian that prevents tables from bloating due to dead tuple accumulation. It should never be disabled — that is the worst thing you can do to a production PostgreSQL. 
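The trigger formula above can be checked in a few lines (a sketch of the arithmetic, not of PostgreSQL internals):

```python
# dead tuples needed before autovacuum starts a worker on a table:
# threshold + scale_factor * n_live_tup
def autovacuum_trigger(n_live_tup, threshold=50, scale_factor=0.2):
    return threshold + scale_factor * n_live_tup

# Defaults on a 10-million-row table: ~2M dead tuples accumulate first.
print(autovacuum_trigger(10_000_000))  # 2000050.0
# Per-table tuning (scale_factor 0.01) intervenes twenty times earlier.
print(autovacuum_trigger(10_000_000, scale_factor=0.01))  # 100050.0
```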
It should be configured per-table: high-traffic tables need low scale_factors (0.01-0.05) and reduced cost_delay.\nWhat can go wrong #With defaults, autovacuum is too conservative for high-traffic tables. 3 workers for dozens of active tables aren\u0026rsquo;t enough. A 20% scale_factor on large tables generates millions of dead tuples before intervention. Per-table tuning with ALTER TABLE ... SET (autovacuum_vacuum_scale_factor = 0.01) is essential.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/autovacuum/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eAutovacuum\u003c/strong\u003e is a PostgreSQL daemon that automatically runs VACUUM and ANALYZE on tables when the number of dead tuples exceeds a threshold calculated as: \u003ccode\u003ethreshold + scale_factor × n_live_tup\u003c/code\u003e. With defaults (threshold=50, scale_factor=0.2), on a table with 10 million rows it triggers after 2 million dead tuples.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe daemon periodically checks \u003ccode\u003epg_stat_user_tables\u003c/code\u003e and launches a worker for each table exceeding the threshold. The maximum number of simultaneous workers is controlled by \u003ccode\u003eautovacuum_max_workers\u003c/code\u003e (default 3). 
The \u003ccode\u003eautovacuum_vacuum_cost_delay\u003c/code\u003e parameter controls how much vacuum throttles itself to avoid overloading I/O.\u003c/p\u003e","title":"Autovacuum"},{"content":"AWR (Automatic Workload Repository) is a built-in Oracle Database component that automatically collects system performance statistics at regular intervals (every 60 minutes by default) and retains them for a configurable period.\nHow it works #AWR captures periodic snapshots that include:\nSession statistics and wait events SQL metrics (top SQL by execution time, I/O, CPU) Memory structure statistics (SGA, PGA) I/O statistics by datafile and tablespace What it\u0026rsquo;s for #The AWR report is the primary tool for diagnosing performance issues in Oracle. By comparing two snapshots you can identify:\nQueries consuming excessive resources Changes in execution plans I/O, CPU or memory bottlenecks Performance regressions after application deployments When to use it #AWR is the first tool to consult when you receive a slowness report. 
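As a sketch in SQL*Plus, using the documented DBMS_WORKLOAD_REPOSITORY API to bracket a test with manual snapshots:

```sql
-- Take a snapshot before and after the workload you want to measure,
-- then generate the report interactively from the standard script.
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
-- ... run the workload ...
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
@?/rdbms/admin/awrrpt.sql
```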
Together with ASH (Active Session History), it lets you reconstruct what happened in the database during a specific time window, even after the problem has resolved.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/awr/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eAWR\u003c/strong\u003e (Automatic Workload Repository) is a built-in Oracle Database component that automatically collects system performance statistics at regular intervals (every 60 minutes by default) and retains them for a configurable period.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eAWR captures periodic snapshots that include:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eSession statistics and wait events\u003c/li\u003e\n\u003cli\u003eSQL metrics (top SQL by execution time, I/O, CPU)\u003c/li\u003e\n\u003cli\u003eMemory structure statistics (SGA, PGA)\u003c/li\u003e\n\u003cli\u003eI/O statistics by datafile and tablespace\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"what-its-for\" class=\"relative group\"\u003eWhat it\u0026rsquo;s for \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#what-its-for\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe AWR report is the primary tool for diagnosing performance issues in Oracle. 
By comparing two snapshots you can identify:\u003c/p\u003e","title":"AWR"},{"content":"B-Tree (Balanced Tree) is the most common data structure for indexes in relational databases and is the default index type in PostgreSQL, MySQL and Oracle. It maintains data sorted in a balanced tree structure that guarantees logarithmic search times.\nHow it works #A B-Tree organises keys in sorted nodes, with each node containing pointers to child nodes. Search starts from the root and descends to the leaves, halving the search space at each level. For a table with 6 million rows, a B-Tree typically requires 3-4 levels of depth, meaning 3-4 page reads to find a value.\nWhat it\u0026rsquo;s for #B-Trees are optimal for equality searches (WHERE col = 'value'), ranges (WHERE col BETWEEN x AND y), sorting and prefix searches (LIKE 'ABC%'). They cannot however be used for searches with a leading wildcard (LIKE '%ABC%'), because the B-Tree ordering doesn\u0026rsquo;t help find substrings at arbitrary positions.\nWhen to use it #B-Tree is the right choice for most indexes. When a \u0026ldquo;contains\u0026rdquo; search on text is needed, you must switch to a GIN index with the pg_trgm extension. The choice between B-Tree and GIN depends on the query type and the table\u0026rsquo;s workload profile.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/b-tree/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eB-Tree\u003c/strong\u003e (Balanced Tree) is the most common data structure for indexes in relational databases and is the default index type in PostgreSQL, MySQL and Oracle. 
It maintains data sorted in a balanced tree structure that guarantees logarithmic search times.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA B-Tree organises keys in sorted nodes, with each node containing pointers to child nodes. Search starts from the root and descends to the leaves, halving the search space at each level. For a table with 6 million rows, a B-Tree typically requires 3-4 levels of depth, meaning 3-4 page reads to find a value.\u003c/p\u003e","title":"B-Tree"},{"content":"The binary log (or binlog) is a sequential binary-format record where MySQL writes all events that modify data: INSERT, UPDATE, DELETE and DDL operations. Files are numbered progressively (mysql-bin.000001, mysql-bin.000002, etc.) and managed through an index file.\nHow it works #From MySQL 8.0, binary logging is enabled by default via the log_bin parameter. MySQL creates a new binlog file when the server starts, when the current file reaches max_binlog_size, or when FLUSH BINARY LOGS is executed. It supports three recording formats: STATEMENT (records SQL statements), ROW (records row-level changes) and MIXED (automatic choice).\nWhat it\u0026rsquo;s for #The binary log serves two fundamental purposes:\nReplication: in a master-slave architecture, the slave reads the master\u0026rsquo;s binlogs to replicate the same operations Point-in-time recovery: after restoring a backup, binlogs allow replaying changes up to a precise moment When to use it #Binary logging is active by default on any MySQL 8.0+ installation. 
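A minimal management sketch (the 7-day retention is an example, not a recommendation; check replica positions before purging on a source):

```sql
-- Inspect what has accumulated, then purge safely by date:
SHOW BINARY LOGS;
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 7 DAY;
-- On MySQL 8.0+, persist an automatic retention instead of manual purges
-- (604800 seconds = 7 days):
SET PERSIST binlog_expire_logs_seconds = 604800;
```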
Active management (retention, purge, space monitoring) is necessary to prevent accumulated files from filling up the disk. The PURGE BINARY LOGS command is the correct way to remove obsolete files.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/binary-log/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003ebinary log\u003c/strong\u003e (or binlog) is a sequential binary-format record where MySQL writes all events that modify data: INSERT, UPDATE, DELETE and DDL operations. Files are numbered progressively (\u003ccode\u003emysql-bin.000001\u003c/code\u003e, \u003ccode\u003emysql-bin.000002\u003c/code\u003e, etc.) and managed through an index file.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eFrom MySQL 8.0, binary logging is enabled by default via the \u003ccode\u003elog_bin\u003c/code\u003e parameter. MySQL creates a new binlog file when the server starts, when the current file reaches \u003ccode\u003emax_binlog_size\u003c/code\u003e, or when \u003ccode\u003eFLUSH BINARY LOGS\u003c/code\u003e is executed. It supports three recording formats: STATEMENT (records SQL statements), ROW (records row-level changes) and MIXED (automatic choice).\u003c/p\u003e","title":"Binary log"},{"content":"Bloat is the accumulation of dead space within a PostgreSQL table or index, caused by dead tuples not yet removed by VACUUM. 
A table with 50% bloat occupies twice the necessary space and forces sequential scans to read twice as many pages.\nHow it works #Bloat is measured by comparing the actual table size with the expected size based on live rows. The pgstattuple extension provides the dead_tuple_percent field. Bloat above 20-30% is a warning sign; above 50% is an emergency.\nWhat it\u0026rsquo;s for #Monitoring bloat is essential to understand whether autovacuum is keeping pace. The pg_stat_user_tables query with n_dead_tup and last_autovacuum is the first diagnostic tool. If bloat is out of control, pg_repack rebuilds the table online without prolonged exclusive locks — unlike VACUUM FULL.\nWhat can go wrong #Normal VACUUM reclaims dead tuple space but doesn\u0026rsquo;t compact the table — fragmented space remains. If bloat reaches 50-70%, VACUUM alone isn\u0026rsquo;t enough. The options are VACUUM FULL (exclusive lock, blocks everything) or pg_repack (online, but requires installation). The real solution is not getting there, with a well-configured autovacuum.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/bloat/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eBloat\u003c/strong\u003e is the accumulation of dead space within a PostgreSQL table or index, caused by dead tuples not yet removed by VACUUM. 
A table with 50% bloat occupies twice the necessary space and forces sequential scans to read twice as many pages.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eBloat is measured by comparing the actual table size with the expected size based on live rows. The \u003ccode\u003epgstattuple\u003c/code\u003e extension provides the \u003ccode\u003edead_tuple_percent\u003c/code\u003e field. Bloat above 20-30% is a warning sign; above 50% is an emergency.\u003c/p\u003e","title":"Bloat"},{"content":"A Branch is an independent development line in a Git repository. Each branch contains a copy of the code that can be worked on without affecting the main branch or other developers\u0026rsquo; work.\nHow it works #When a developer creates a branch (e.g. fix/issue-234-calculation-error), Git creates a pointer to the current code version. From that point, changes made on the branch remain isolated. When work is complete, changes are proposed to the team via Pull Request and, after approval, merged into the main branch.\nWhat it\u0026rsquo;s for #Branches eliminate the problem of accidental overwrites and unmanaged conflicts. Every developer works in their own isolated area: they don\u0026rsquo;t overwrite others\u0026rsquo; work and don\u0026rsquo;t break working code. The main branch always stays in a \u0026ldquo;good\u0026rdquo; state because it only receives approved code.\nWhen to use it #A branch is created for every task, bug fix or feature. Naming conventions help identify the purpose: fix/ for bugs, feature/ for new features, hotfix/ for urgent fixes. 
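The lifecycle can be sketched in a throwaway repository (names are examples; git 2.28+ assumed for init -b; the merge stands in for an approved Pull Request):

```shell
set -e
repo=$(mktemp -d) && cd $repo
git init -q -b main
git config user.email 'dev@example.com' && git config user.name 'Dev'
echo 'v1' > calc.txt && git add -A && git commit -qm 'initial'

git switch -qc fix/issue-234-calculation-error   # isolated line of work
echo 'v2' > calc.txt && git add -A && git commit -qm 'fix calculation'

git switch -q main                               # main was never touched
git merge -q fix/issue-234-calculation-error     # stand-in for a merged PR
git branch -d fix/issue-234-calculation-error    # clean up after merge
```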
The branch is deleted after merge to keep the repository clean.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/branch/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eBranch\u003c/strong\u003e is an independent development line in a Git repository. Each branch contains a copy of the code that can be worked on without affecting the main branch or other developers\u0026rsquo; work.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a developer creates a branch (e.g. \u003ccode\u003efix/issue-234-calculation-error\u003c/code\u003e), Git creates a pointer to the current code version. From that point, changes made on the branch remain isolated. When work is complete, changes are proposed to the team via Pull Request and, after approval, merged into the main branch.\u003c/p\u003e","title":"Branch"},{"content":"The Brompton is a folding bicycle made in London since 1975, considered the world reference in its category. It folds in 10-20 seconds to approximately 58×56×27 cm — compact enough to fit under a desk or in a small car\u0026rsquo;s trunk.\nHow it works #The patented mechanism allows folding the bike in three moves: frame, handlebars, and saddle. In the electric version (Brompton Electric), a hub motor provides pedal assist up to 25 km/h with a range of 40-70 km. The battery is removable and charges in 4 hours.\nWhat it\u0026rsquo;s for #It is the ideal solution for multimodal commuting: ride to the station, fold, board the metro or train, unfold, ride to the office. Once there, store it under the desk. 
Zero parking, zero theft, zero constraints.\nWhy it matters #In the direct car vs Brompton comparison in Rome (Appio Latino → Prati), the Brompton takes 18 minutes versus 50 by car plus 90 minutes of parking. Daily cost: €0 versus €35. The Brompton pays for itself in less than a year of parking savings alone.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/brompton/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eBrompton\u003c/strong\u003e is a folding bicycle made in London since 1975, considered the world reference in its category. It folds in 10-20 seconds to approximately 58×56×27 cm — compact enough to fit under a desk or in a small car\u0026rsquo;s trunk.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe patented mechanism allows folding the bike in three moves: frame, handlebars, and saddle. In the electric version (Brompton Electric), a hub motor provides pedal assist up to 25 km/h with a range of 40-70 km. The battery is removable and charges in 4 hours.\u003c/p\u003e","title":"Brompton"},{"content":"BYOL (Bring Your Own License) is an Oracle program that allows organizations to transfer software licenses purchased for on-premises infrastructure to Oracle Cloud Infrastructure (OCI), without having to purchase new cloud licenses.\nHow it works #When an organization already owns Oracle licenses — typically Enterprise Edition with options like RAC, Data Guard or Partitioning — it can \u0026ldquo;bring them along\u0026rdquo; in the migration to OCI. 
The support contract (Software Update License \u0026amp; Support) is maintained, and the licenses are associated with cloud resources instead of physical servers.\nOn OCI, each OCPU corresponds to one processor license, with a transparent 1:1 ratio. This makes the calculation predictable and compliant with Oracle licensing policies.\nWhy it matters in migrations #BYOL is often the decisive factor in choosing OCI over other cloud providers. On AWS or Azure, Oracle applies different licensing rules: each vCPU counts as half a processor, and options like RAC are either unsupported or require additional licenses. An Oracle audit on a non-OCI cloud can turn an apparent saving into a very significant unexpected cost.\nWhat it covers # Oracle Database (all editions) Database options (RAC, Data Guard, Partitioning, Advanced Compression, etc.) Oracle Middleware and other Oracle products with eligible licenses BYOL is not automatic: it must be requested and configured when provisioning OCI resources, specifying the existing licenses in the contract.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/byol/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eBYOL\u003c/strong\u003e (Bring Your Own License) is an Oracle program that allows organizations to transfer software licenses purchased for on-premises infrastructure to Oracle Cloud Infrastructure (OCI), without having to purchase new cloud licenses.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen an organization already owns Oracle licenses — typically Enterprise Edition with options like 
RAC, Data Guard or Partitioning — it can \u0026ldquo;bring them along\u0026rdquo; in the migration to OCI. The support contract (Software Update License \u0026amp; Support) is maintained, and the licenses are associated with cloud resources instead of physical servers.\u003c/p\u003e","title":"BYOL"},{"content":"The Carbon Footprint is the total amount of greenhouse gases — primarily CO₂ — emitted directly or indirectly by an activity, product, or individual, expressed in tonnes of CO₂ equivalent.\nHow it works #For urban commuting, the calculation is direct: a car stuck in Rome traffic produces an average of 120-150 g of CO₂ per kilometer. In congested traffic even more, because the engine idles consuming fuel without moving. A bicycle produces zero direct emissions.\nWhat it\u0026rsquo;s for #It quantifies the environmental impact of mobility choices. If just 10% of Roman commuters switched to cycling, approximately 150,000 tonnes of CO₂ would be saved per year — equivalent to planting 7 million trees. It\u0026rsquo;s not idealism, it\u0026rsquo;s arithmetic.\nWhy it matters #The commuting carbon footprint is an externalized cost that nobody pays directly but everyone suffers: air pollution, climate change, healthcare costs. 
The choice between car and bike is not just personal — it has a measurable collective impact.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/carbon-footprint/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eCarbon Footprint\u003c/strong\u003e is the total amount of greenhouse gases — primarily CO₂ — emitted directly or indirectly by an activity, product, or individual, expressed in tonnes of CO₂ equivalent.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eFor urban commuting, the calculation is direct: a car stuck in Rome traffic produces an average of 120-150 g of CO₂ per kilometer. In congested traffic even more, because the engine idles consuming fuel without moving. A bicycle produces zero direct emissions.\u003c/p\u003e","title":"Carbon Footprint"},{"content":"CDC (Change Data Capture) is a technique for intercepting data changes (INSERT, UPDATE, DELETE) as they occur and propagating them to other systems in real time or near-real time. Unlike traditional batch approaches (periodic ETL), CDC captures changes continuously and incrementally.\nHow it works #The most common approach is log-based CDC: an external component reads the database\u0026rsquo;s transaction logs (binary log in MySQL, WAL in PostgreSQL, redo log in Oracle) and converts events into a data stream consumable by other systems. 
Tools like Debezium, Maxwell and Canal implement this approach for MySQL by reading binary logs directly.\nWhat it\u0026rsquo;s for #CDC is used for:\nSynchronising data between different databases in real time Feeding data warehouses and data lakes with incremental updates Populating caches and search indexes (Elasticsearch, Redis) Implementing event-driven architectures and microservices When to use it #Log-based CDC on MySQL requires binary logging to be active and in ROW format (which records row-level changes). Disabling binary logs or using STATEMENT format rules out log-based CDC tools, making real-time integration through them impossible.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/cdc/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCDC\u003c/strong\u003e (Change Data Capture) is a technique for intercepting data changes (INSERT, UPDATE, DELETE) as they occur and propagating them to other systems in real time or near-real time. Unlike traditional batch approaches (periodic ETL), CDC captures changes continuously and incrementally.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe most common approach is \u003cstrong\u003elog-based CDC\u003c/strong\u003e: an external component reads the database\u0026rsquo;s transaction logs (binary log in MySQL, WAL in PostgreSQL, redo log in Oracle) and converts events into a data stream consumable by other systems. 
Tools like Debezium, Maxwell and Canal implement this approach for MySQL by reading binary logs directly.\u003c/p\u003e","title":"CDC"},{"content":"A table\u0026rsquo;s churn is the measure of how much its data changes after insertion. A high-churn table undergoes frequent UPDATEs and DELETEs; a low-churn table is predominantly append-only (INSERT only).\nHow it works #In PostgreSQL, every UPDATE creates a new row version (due to the MVCC model) and the old version becomes a dead tuple. DELETEs also create dead tuples. The higher the churn, the more work VACUUM and indexes must do to maintain performance. A GIN index on a high-churn table can significantly degrade write performance.\nWhat it\u0026rsquo;s for #Evaluating churn before creating an index is essential to avoid solving a read problem by creating a write problem. On an append-only table (zero UPDATEs, zero DELETEs, zero dead tuples), a GIN index has minimal write impact. On a high-churn table, the same index could become a bottleneck.\nWhen to use it #Churn is analysed by checking table statistics: daily UPDATE and DELETE counts, dead tuples, VACUUM frequency. In PostgreSQL, pg_stat_user_tables provides these metrics. The decision to add a GIN or trigram index should always start from this analysis.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/churn/","section":"Glossary","summary":"\u003cp\u003eA table\u0026rsquo;s \u003cstrong\u003echurn\u003c/strong\u003e is the measure of how much its data changes after insertion. 
A high-churn table undergoes frequent UPDATEs and DELETEs; a low-churn table is predominantly append-only (INSERT only).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn PostgreSQL, every UPDATE creates a new row version (due to the MVCC model) and the old version becomes a dead tuple. DELETEs also create dead tuples. The higher the churn, the more work VACUUM and indexes must do to maintain performance. A GIN index on a high-churn table can significantly degrade write performance.\u003c/p\u003e","title":"Churn"},{"content":"COALESCE is a standard SQL function that accepts a list of expressions and returns the first one that is not NULL. If all expressions are NULL, it returns NULL.\nSyntax #COALESCE(expression1, expression2, expression3, ...) 
It\u0026rsquo;s equivalent to a CASE WHEN chain:\nCASE WHEN expression1 IS NOT NULL THEN expression1 WHEN expression2 IS NOT NULL THEN expression2 WHEN expression3 IS NOT NULL THEN expression3 ELSE NULL END Use in hierarchies #In the context of ragged hierarchies, COALESCE is often used to fill missing levels:\nCOALESCE(top_group_name, group_name, client_name) AS top_group_name This works as a report workaround, but has significant limitations: it must be repeated in every query, it doesn\u0026rsquo;t distinguish original values from fallback ones, and it complicates the code.\nDatabase alternatives # Oracle: NVL(a, b) for two values, COALESCE for more than two MySQL: IFNULL(a, b) for two values, COALESCE for more than two PostgreSQL: COALESCE only (standard SQL) Recommended approach in the DWH #In a data warehouse, it\u0026rsquo;s better to use COALESCE in the ETL to populate the dimension table with NOT NULL values (self-parenting), rather than using it repeatedly in reports. NULL handling logic belongs in the model, not in the presentation layer.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/coalesce/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCOALESCE\u003c/strong\u003e is a standard SQL function that accepts a list of expressions and returns the first one that is not NULL. 
If all expressions are NULL, it returns NULL.\u003c/p\u003e\n\u003ch2 id=\"syntax\" class=\"relative group\"\u003eSyntax \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#syntax\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-sql\" data-lang=\"sql\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"n\"\u003eCOALESCE\u003c/span\u003e\u003cspan class=\"p\"\u003e(\u003c/span\u003e\u003cspan class=\"n\"\u003eexpression1\u003c/span\u003e\u003cspan class=\"p\"\u003e,\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression2\u003c/span\u003e\u003cspan class=\"p\"\u003e,\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression3\u003c/span\u003e\u003cspan class=\"p\"\u003e,\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"p\"\u003e...)\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eIt\u0026rsquo;s equivalent to a CASE WHEN chain:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" class=\"chroma\"\u003e\u003ccode class=\"language-sql\" data-lang=\"sql\"\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003eCASE\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eWHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression1\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan 
class=\"k\"\u003eIS\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNOT\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNULL\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eTHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression1\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"w\"\u003e     \u003c/span\u003e\u003cspan class=\"k\"\u003eWHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression2\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eIS\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNOT\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNULL\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eTHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression2\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"w\"\u003e     \u003c/span\u003e\u003cspan class=\"k\"\u003eWHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eexpression3\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eIS\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNOT\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNULL\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eTHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e 
\u003c/span\u003e\u003cspan class=\"n\"\u003eexpression3\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"w\"\u003e     \u003c/span\u003e\u003cspan class=\"k\"\u003eELSE\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eNULL\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eEND\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003ch2 id=\"use-in-hierarchies\" class=\"relative group\"\u003eUse in hierarchies \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#use-in-hierarchies\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn the context of ragged hierarchies, COALESCE is often used to fill missing levels:\u003c/p\u003e","title":"COALESCE"},{"content":"Code Review is the practice where a colleague examines code written by another developer before it is incorporated into the main branch. On GitHub it happens inside Pull Requests.\nHow it works #The developer opens a Pull Request with their changes. An assigned reviewer examines the code diff, leaves comments, suggests improvements and eventually approves or requests changes. The process is asynchronous: no meetings needed, the review happens on the tool. Only after approval is the code merged into the main branch.\nWhat it\u0026rsquo;s for #Code review catches bugs that automated tests don\u0026rsquo;t find, improves code quality, and — an often underestimated aspect — spreads codebase knowledge across the team. 
If only one person knows a module and they leave, the project has a problem. With code reviews, at least two people know every piece of code.\nWhen to use it #On every Pull Request, without exceptions. It\u0026rsquo;s not a formality: it\u0026rsquo;s an investment in quality. Time spent in review is always less than time spent fixing bugs in production discovered too late.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/code-review/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCode Review\u003c/strong\u003e is the practice where a colleague examines code written by another developer before it is incorporated into the main branch. On GitHub it happens inside Pull Requests.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe developer opens a Pull Request with their changes. An assigned reviewer examines the code diff, leaves comments, suggests improvements and eventually approves or requests changes. The process is asynchronous: no meetings needed, the review happens on the tool. Only after approval is the code merged into the main branch.\u003c/p\u003e","title":"Code Review"},{"content":"Commuting is the daily travel between home and workplace. In large Italian cities like Rome, the average commute absorbs 2-4 hours per day, with direct costs (fuel, parking, public transport) and indirect costs (stress, fatigue, lost productivity).\nHow it works #An IT consultant living 30 km from the office in Rome may spend 1h15-2h30 just for the one-way trip. 
Over 220 working days a year, that works out to 47-89 hours lost per month — up to two working weeks spent in the car producing nothing.\nWhy it matters #Commuting is not just lost time. It is mental energy burned before the workday even begins. An IT consultant works with their mind: analyzing systems, writing code, designing architectures. If that mind arrives already drained after an hour of traffic, the value of the workday is compromised from the start.\nWhat can go wrong #Companies that ignore commuting costs pay a hidden price: for 50 consultants in Rome, the estimated cost is ~1,700,000 euros/year in lost hours, office rent, and related expenses. Smart working at 80% reduces this cost to ~174,000 euros/year from the second year onward.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pendolarismo/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCommuting\u003c/strong\u003e is the daily travel between home and workplace. In large Italian cities like Rome, the average commute absorbs 2-4 hours per day, with direct costs (fuel, parking, public transport) and indirect costs (stress, fatigue, lost productivity).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eAn IT consultant living 30 km from the office in Rome may spend 1h15-2h30 just for the one-way trip. 
Over 220 working days a year, that works out to 47-89 hours lost per month — up to two working weeks spent in the car producing nothing.\u003c/p\u003e","title":"Commuting"},{"content":"Compliance (regulatory compliance) is an organization\u0026rsquo;s adherence to the laws, regulations, and industry standards applicable to its activity. In the AI context, it includes GDPR, banking regulations (SOX, PCI-DSS), healthcare regulations, and internal policies on data usage.\nHow it works #Compliance is verified through audits, document reviews, and continuous monitoring. For AI projects, it requires traceability of data used for training, documentation of automated decisions, and the ability to explain how the model arrived at a given output (explainability).\nWhat it\u0026rsquo;s for #It ensures the organization operates within legal and regulatory boundaries. In an AI project, compliance is not optional — it is a design constraint. A model trained on GDPR-subject data without consent is not just a technical risk, it is a violation.\nWhy it matters #In the Governance-Compliance-Automation triangle, compliance is the vertex that can never be sacrificed. The AI Manager must ensure every automation respects regulatory constraints — and this requires deep understanding of both the technology and the regulatory context. It is not enough for AI to work: it must work within the rules.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/compliance/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCompliance\u003c/strong\u003e (regulatory compliance) is an organization\u0026rsquo;s adherence to the laws, regulations, and industry standards applicable to its activity. 
In the AI context, it includes GDPR, banking regulations (SOX, PCI-DSS), healthcare regulations, and internal policies on data usage.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eCompliance is verified through audits, document reviews, and continuous monitoring. For AI projects, it requires traceability of data used for training, documentation of automated decisions, and the ability to explain how the model arrived at a given output (explainability).\u003c/p\u003e","title":"Compliance"},{"content":"CTAS (Create Table As Select) is an Oracle SQL command that creates a new table and populates it in a single operation with the results of a SELECT. It is the standard technique for migrating data from one structure to another on large tables.\nHow it works #The command combines DDL and DML: it creates the table with the structure derived from the SELECT and inserts the data in a single pass. With the PARALLEL hint and NOLOGGING mode, copying hundreds of GB can complete in a few hours. After the copy, the original table is renamed, the new one takes its place, and downtime is limited to the few seconds of the rename.\nWhat it\u0026rsquo;s for #CTAS is essential when restructuring a table without being able to use ALTER TABLE directly — for example, adding partitioning to an existing table with billions of rows. It allows working on the new structure while the system is live on the old one.\nWhen to use it #Used for migrations to partitioned tables, reorganising fragmented data, and creating table copies with different structures. 
In production, it should always be combined with NOLOGGING (to reduce redo logs) and followed by an immediate RMAN backup.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/ctas/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eCTAS\u003c/strong\u003e (Create Table As Select) is an Oracle SQL command that creates a new table and populates it in a single operation with the results of a SELECT. It is the standard technique for migrating data from one structure to another on large tables.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe command combines DDL and DML: it creates the table with the structure derived from the SELECT and inserts the data in a single pass. With the \u003ccode\u003ePARALLEL\u003c/code\u003e hint and \u003ccode\u003eNOLOGGING\u003c/code\u003e mode, copying hundreds of GB can complete in a few hours. After the copy, the original table is renamed, the new one takes its place, and downtime is limited to the few seconds of the rename.\u003c/p\u003e","title":"CTAS"},{"content":"The cutover is the moment when a production system is moved from the old infrastructure to the new one. It\u0026rsquo;s the most visible phase of a migration — the one everyone remembers, for better or worse.\nAnatomy of a cutover #A well-planned cutover follows a detailed runbook with numbered steps, estimated times, success criteria and rollback procedures for each step. 
Typical components:\nApplication stop — closing connections and verifying no sessions are active Final synchronization — in a Data Guard migration, verifying transport lag and apply lag are at zero Switchover/migration — the technical operation that transfers the service Validation — connectivity tests, verification queries, functional tests Gradual opening — progressive readmission of users Downtime and windows #A cutover\u0026rsquo;s downtime is the time between the last user disconnecting and the first user reconnecting. With Data Guard switchover, downtime can be in the order of minutes. With Data Pump, it can be hours or days.\nThe cutover window is planned during periods of lowest usage: nights, weekends, holidays. But \u0026ldquo;lowest usage\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;zero usage\u0026rdquo; — in manufacturing companies with 24/7 shifts, there\u0026rsquo;s no moment when the database isn\u0026rsquo;t needed by someone.\nRollback #Every cutover must have a rollback plan. With Data Guard, rollback is a second switchover — relatively straightforward. With Data Pump, rollback means restarting the original database and accepting the loss of transactions that occurred after the migration began. The quality of the rollback plan is inversely proportional to the probability of needing it — but woe to those who don\u0026rsquo;t have one.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/cutover/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003ecutover\u003c/strong\u003e is the moment when a production system is moved from the old infrastructure to the new one. 
It\u0026rsquo;s the most visible phase of a migration — the one everyone remembers, for better or worse.\u003c/p\u003e\n\u003ch2 id=\"anatomy-of-a-cutover\" class=\"relative group\"\u003eAnatomy of a cutover \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#anatomy-of-a-cutover\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA well-planned cutover follows a detailed runbook with numbered steps, estimated times, success criteria and rollback procedures for each step. Typical components:\u003c/p\u003e","title":"Cutover"},{"content":"The Daily Standup is a brief daily meeting (maximum 15 minutes) where each team member answers three questions: what I did yesterday, what I will do today, is anything blocking me. The purpose is to synchronize the team, not to solve problems.\nHow it works #Each person has about two minutes for their update. Problems are flagged but not discussed: resolution happens afterwards, between the people involved. The time constraint is what makes the standup effective — without it, it degenerates into a 45-minute status meeting.\nWhat it\u0026rsquo;s for #It keeps the team aligned on project status, surfaces blockers before they become critical, and creates a daily rhythm that gives structure to the work. A well-managed standup replaces dozens of emails and Slack messages.\nWhat can go wrong #The degeneration pattern is predictable: the first week it lasts 15 minutes, the third week 35, the fourth week the team starts skipping it. 
The most common causes are \u0026ldquo;thread killers\u0026rdquo; (technical discussions between two people while others wait), improvised demos, and managers asking probing questions.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/daily-standup/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eDaily Standup\u003c/strong\u003e is a brief daily meeting (maximum 15 minutes) where each team member answers three questions: what I did yesterday, what I will do today, is anything blocking me. The purpose is to synchronize the team, not to solve problems.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach person has about two minutes for their update. Problems are flagged but not discussed: resolution happens afterwards, between the people involved. The time constraint is what makes the standup effective — without it, it degenerates into a 45-minute status meeting.\u003c/p\u003e","title":"Daily Standup"},{"content":"Data Governance is the set of policies, processes, roles, and standards an organization adopts to ensure its data is accurate, secure, compliant with regulations, and used consistently.\nHow it works #It defines who is responsible for data (data owner, data steward), what quality rules to apply, how to classify data by sensitivity, and how to trace its provenance (data lineage). 
In an AI context, it also includes verifying the provenance and quality of data used for model training.\nWhat it\u0026rsquo;s for #Without data governance, an organization doesn\u0026rsquo;t know what data it has, where it is, who can access it, and whether it\u0026rsquo;s reliable. In projects with AI components, governance is the prerequisite for preventing models from being trained on dirty, unauthorized, or regulation-bound data such as GDPR-protected information.\nWhy it matters #In every AI project in a regulated environment, the Governance-Compliance-Automation triangle must stay balanced. An efficient AI automation that violates data governance policies is a risk. Perfect governance that blocks all automation stalls the project. The AI Manager keeps these three vertices in balance.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/data-governance/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eData Governance\u003c/strong\u003e is the set of policies, processes, roles, and standards an organization adopts to ensure its data is accurate, secure, compliant with regulations, and used consistently.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIt defines who is responsible for data (data owner, data steward), what quality rules to apply, how to classify data by sensitivity, and how to trace its provenance (data lineage). 
In an AI context, it also includes verifying the provenance and quality of data used for model training.\u003c/p\u003e","title":"Data Governance"},{"content":"Data Guard is Oracle\u0026rsquo;s technology for maintaining one or more synchronized copies (standby) of a production database (primary). The standby continuously receives and applies redo logs generated by the primary, staying aligned in real time or near-real time.\nHow it works #The primary generates redo logs with every transaction. These logs are transmitted to the standby over the network, where they are applied in one of two ways:\nPhysical standby: applies redo at block level (exact replica, byte for byte) Logical standby: reconstructs SQL statements from the redo and re-executes them If the primary fails, the standby can become the new primary via switchover (planned) or failover (emergency).\nActive Data Guard #The Active Data Guard variant allows the standby to be opened in read-only mode while it continues applying redo. This enables using it for reports, backups and analytical queries, offloading the primary.\nProtection modes #\nMaxPerformance: asynchronous replication, no impact on primary performance; data loss possible (a few seconds)\nMaxAvailability: synchronous replication, degrades to MaxPerformance if the standby is unreachable; zero data loss under normal conditions\nMaxProtection: synchronous replication, the primary halts if the standby doesn\u0026rsquo;t confirm; zero data loss guaranteed\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/data-guard/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eData Guard\u003c/strong\u003e is Oracle\u0026rsquo;s technology for maintaining one or more synchronized copies (standby) of a production database (primary). 
The standby continuously receives and applies redo logs generated by the primary, staying aligned in real time or near-real time.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe primary generates redo logs with every transaction. These logs are transmitted to the standby over the network, where they are applied in one of two ways:\u003c/p\u003e","title":"Data Guard"},{"content":"A Data Warehouse (DWH) is a data storage system specifically designed for analysis, reporting and business decision support. Unlike operational databases (OLTP), a DWH collects data from multiple sources, transforms it and organises it into structures optimised for analytical queries.\nHow it works #Data is extracted from source systems (ERPs, CRMs, business applications), transformed through ETL processes that clean, normalise and enrich it, and finally loaded into the DWH. The typical data model is the star schema: a central fact table with numerical measures linked to dimension tables that describe context (time, customer, product, geography).\nWhat it\u0026rsquo;s for #A DWH enables answering business questions that operational systems cannot handle: historical trends, comparative analysis across periods, cross-system aggregations, business KPIs. It separates analytical workload from transactional workload, preventing reporting queries from impacting operational application performance.\nWhen to use it #A DWH is needed when a company needs to integrate data from diverse sources to produce consolidated analyses. 
Complexity and costs depend on the number of source systems, data volume and required update frequency.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/data-warehouse/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eData Warehouse\u003c/strong\u003e (DWH) is a data storage system specifically designed for analysis, reporting and business decision support. Unlike operational databases (OLTP), a DWH collects data from multiple sources, transforms it and organises it into structures optimised for analytical queries.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eData is extracted from source systems (ERPs, CRMs, business applications), transformed through ETL processes that clean, normalise and enrich it, and finally loaded into the DWH. The typical data model is the star schema: a central fact table with numerical measures linked to dimension tables that describe context (time, customer, product, geography).\u003c/p\u003e","title":"Data Warehouse"},{"content":"A Dead Tuple is a row in a PostgreSQL table that has been updated (UPDATE) or deleted (DELETE) but has not yet been physically removed. It remains in the data pages, occupying disk space and slowing down scans.\nHow it works #When PostgreSQL executes an UPDATE, it does not overwrite the original row: it creates a new version and marks the old one as \u0026ldquo;dead.\u0026rdquo; The old row remains physically in the data page until VACUUM cleans it up. 
Dead tuples are the price of the MVCC model — necessary to guarantee transactional isolation.\nWhat it\u0026rsquo;s for #Dead tuples are a key indicator of table health. The pg_stat_user_tables view shows n_dead_tup and last_autovacuum — if dead tuples grow faster than autovacuum can clean, the table has a problem. A dead_tuple_percent above 20-30% is a warning sign.\nWhat can go wrong #On a table with 500,000 updates per day and autovacuum defaults (scale_factor 0.2), VACUUM triggers every 4 days. Meanwhile dead tuples accumulate, tables bloat, and queries slow down progressively — the \u0026ldquo;Monday fine, Friday disaster\u0026rdquo; pattern.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/dead-tuple/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eDead Tuple\u003c/strong\u003e is a row in a PostgreSQL table that has been updated (UPDATE) or deleted (DELETE) but has not yet been physically removed. It remains in the data pages, occupying disk space and slowing down scans.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen PostgreSQL executes an UPDATE, it does not overwrite the original row: it creates a new version and marks the old one as \u0026ldquo;dead.\u0026rdquo; The old row remains physically in the data page until VACUUM cleans it up. 
Dead tuples are the price of the MVCC model — necessary to guarantee transactional isolation.\u003c/p\u003e","title":"Dead Tuple"},{"content":"DEFAULT PRIVILEGES is a PostgreSQL mechanism that allows defining in advance the privileges that will be automatically assigned to all future objects created in a schema. It is configured with the ALTER DEFAULT PRIVILEGES command.\nHow it works #The command ALTER DEFAULT PRIVILEGES IN SCHEMA schema1 GRANT SELECT ON TABLES TO srv_monitoring ensures that every new table created in schema1 is automatically readable by srv_monitoring. Without this configuration, future tables would require a manual GRANT each time.\nWhat it\u0026rsquo;s for #It is the part that most administrators forget when creating read-only users. GRANTs on ALL TABLES IN SCHEMA cover only existing tables. Tables created afterwards require new GRANTs — unless DEFAULT PRIVILEGES are used. Without them, the monitoring user stops working at the first new table.\nWhat can go wrong #DEFAULT PRIVILEGES apply to the ROLE that creates the objects. If multiple users create tables in a schema, default privileges must be configured for each creator. This detail often causes hard-to-diagnose errors: \u0026ldquo;the GRANT is there, but the new table isn\u0026rsquo;t readable.\u0026rdquo;\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/default-privileges/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eDEFAULT PRIVILEGES\u003c/strong\u003e is a PostgreSQL mechanism that allows defining in advance the privileges that will be automatically assigned to all future objects created in a schema. 
It is configured with the \u003ccode\u003eALTER DEFAULT PRIVILEGES\u003c/code\u003e command.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe command \u003ccode\u003eALTER DEFAULT PRIVILEGES IN SCHEMA schema1 GRANT SELECT ON TABLES TO srv_monitoring\u003c/code\u003e ensures that every new table created in \u003ccode\u003eschema1\u003c/code\u003e is automatically readable by \u003ccode\u003esrv_monitoring\u003c/code\u003e. Without this configuration, future tables would require a manual GRANT each time.\u003c/p\u003e","title":"DEFAULT PRIVILEGES"},{"content":"default_statistics_target is the PostgreSQL parameter that defines the level of detail of the statistics the ANALYZE command builds for each column. The default value is 100.\nHow it works #PostgreSQL samples a certain number of values for each column and uses them to build two structures:\nMost common values (MCV): the list of the most frequent values, with their respective frequencies Histogram: the distribution of the remaining values, divided into equal-population buckets The default_statistics_target parameter determines how many elements these structures will have. With the default value of 100, the histogram will have 100 buckets and the MCV list will contain up to 100 values.\nWhen to increase it #For small tables or tables with uniform distribution, the default target of 100 is sufficient. 
For large tables with skewed distribution — where a few values dominate most rows — a target of 100 can give a distorted picture, leading the optimizer to wrong cardinality estimates.\nYou can increase the target at the column level:\nALTER TABLE orders ALTER COLUMN status SET STATISTICS 500; ANALYZE orders; Values between 500 and 1000 significantly improve estimate quality on columns with non-uniform distribution.\nPractical limits #Beyond 1000 the benefit is marginal and ANALYZE itself becomes slower, because it needs to sample more rows and build larger structures. It\u0026rsquo;s a fine-tuning adjustment: apply it only to columns that actually cause wrong estimates, not to every column in every table.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/postgresql-default-statistics-target/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003edefault_statistics_target\u003c/strong\u003e is the PostgreSQL parameter that defines the level of detail of the statistics the \u003ccode\u003eANALYZE\u003c/code\u003e command builds for each column. 
The default value is 100.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003ePostgreSQL samples a certain number of values for each column and uses them to build two structures:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eMost common values (MCV)\u003c/strong\u003e: the list of the most frequent values, with their respective frequencies\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eHistogram\u003c/strong\u003e: the distribution of the remaining values, divided into equal-population buckets\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe \u003ccode\u003edefault_statistics_target\u003c/code\u003e parameter determines how many elements these structures will have. With the default value of 100, the histogram will have 100 buckets and the MCV list will contain up to 100 values.\u003c/p\u003e","title":"default_statistics_target"},{"content":"Directive 2011/7/EU is the European regulation on late payment in commercial transactions. It establishes clear rules: standard term of 30 days, maximum 60 between businesses (with explicit agreement), 30 for public administration, and automatic late interest at the ECB rate + 8%.\nHow it works #The directive was transposed into Italian law through Legislative Decree 231/2002 (amended in 2012). On paper the rules exist: 30 days standard, automatic interest, flat compensation of €40 per late-paid invoice. 
In Italian practice it is as if they don\u0026rsquo;t exist — the average Italian DSO is 80 days, well beyond the 60-day maximum.\nWhat it\u0026rsquo;s for #It should protect suppliers — particularly small businesses and freelancers — from structural late payments. In countries like Germany (24-day DSO) and the Netherlands (27-day DSO) the directive works. In Italy the gap between law and reality is enormous.\nWhy it matters #The directive demonstrates that Italy\u0026rsquo;s late payment problem is not legislative — the laws exist. It is cultural and structural: the reputational cost of asserting one\u0026rsquo;s rights exceeds the economic benefit, and the system relies on the creditor\u0026rsquo;s docility.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/direttiva-2011-7-ue/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eDirective 2011/7/EU\u003c/strong\u003e is the European regulation on late payment in commercial transactions. It establishes clear rules: standard term of 30 days, maximum 60 between businesses (with explicit agreement), 30 for public administration, and automatic late interest at the ECB rate + 8%.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe directive was transposed into Italian law through Legislative Decree 231/2002 (amended in 2012). On paper the rules exist: 30 days standard, automatic interest, flat compensation of €40 per late-paid invoice. 
In Italian practice it is as if they don\u0026rsquo;t exist — the average Italian DSO is 80 days, well beyond the 60-day maximum.\u003c/p\u003e","title":"Directive 2011/7/EU"},{"content":"Drill-down is a report navigation operation that allows moving from an aggregated level to a more detailed level, descending through a hierarchy.\nHow it works #In a Top Group → Group → Client hierarchy:\nStart at the highest level: total revenue by Top Group Click on a Top Group to see its Groups (first-level drill-down) Click on a Group to see individual Clients (second-level drill-down) The reverse operation — going back up from detail to aggregate — is called drill-up (or roll-up).\nRequirements for correct drill-down #To work without errors, drill-down requires:\nA complete hierarchy: no missing levels (no NULLs) Total consistency: the sum of values at the detail level must match the total at the higher level Balanced structure: all branches of the hierarchy must have the same depth If the hierarchy is unbalanced (ragged hierarchy), drill-down produces incomplete or incorrect results. Self-parenting solves this by balancing the structure upstream.\nDrill-down vs filter #Drill-down is not a simple filter: it\u0026rsquo;s structured navigation along a predefined hierarchy. 
A filter shows a subset of data; a drill-down shows the next level of detail within a hierarchical context.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/drill-down/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eDrill-down\u003c/strong\u003e is a report navigation operation that allows moving from an aggregated level to a more detailed level, descending through a hierarchy.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a Top Group → Group → Client hierarchy:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eStart at the highest level: total revenue by Top Group\u003c/li\u003e\n\u003cli\u003eClick on a Top Group to see its Groups (first-level drill-down)\u003c/li\u003e\n\u003cli\u003eClick on a Group to see individual Clients (second-level drill-down)\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe reverse operation — going back up from detail to aggregate — is called \u003cstrong\u003edrill-up\u003c/strong\u003e (or roll-up).\u003c/p\u003e","title":"Drill-down"},{"content":"DSO (Days Sales Outstanding) is the metric that measures the average number of days a company takes to collect its receivables after invoicing. It is the primary indicator of payment speed in a market.\nHow it works #It is calculated as: (Trade Receivables / Revenue) × Days in Period. A DSO of 30 means clients pay within a month on average. 
In Italy the average DSO is 80 days according to the European Payment Report — nearly three times the northern European average (24-27 days).\nWhat it\u0026rsquo;s for #For a freelance consultant, DSO determines working capital needs. With a 90-day DSO and €5,500/month revenue, at least €16,500 in reserves are needed to cover the first three months without income. Without reserves, the consultant is financing their client at zero cost.\nWhy it matters #Italy is off the scale relative to the EU Directive that sets the maximum term at 60 days. An 80-day DSO is not just a financial problem — it is a structural indicator of a market where bargaining power is tilted in favor of the client.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/dso/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eDSO\u003c/strong\u003e (Days Sales Outstanding) is the metric that measures the average number of days a company takes to collect its receivables after invoicing. It is the primary indicator of payment speed in a market.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIt is calculated as: \u003ccode\u003e(Trade Receivables / Revenue) × Days in Period\u003c/code\u003e. A DSO of 30 means clients pay within a month on average. 
In Italy the average DSO is 80 days according to the European Payment Report — nearly three times the northern European average (24-27 days).\u003c/p\u003e","title":"DSO"},{"content":"ETL (Extract, Transform, Load) is the fundamental process through which data is moved from source systems (operational databases, files, APIs) into the data warehouse.\nThe three phases # Extract: pulling data from source systems. Can be full (complete load) or incremental (only new or changed data) Transform: cleaning, validating, standardizing and enriching the data. This is where business rules are applied, dimension lookups performed, derived calculations computed Load: loading the transformed data into the data warehouse tables (fact and dimension) Why it matters #ETL is the least visible but most critical part of a data warehouse. If data is extracted incompletely, transformed with incorrect rules, or loaded without checks, everything built on top — reports, dashboards, decisions — will be wrong.\nA well-designed ETL also determines the loading window: how long it takes to refresh the data warehouse. In real-world environments, going from 4 hours to 25 minutes can mean the difference between data being current by morning or by afternoon.\nELT vs ETL #With the rise of cloud data warehouses and high-performance columnar engines, the ELT (Extract, Load, Transform) pattern has become common: data is loaded raw into the warehouse and transformed there, leveraging the SQL engine\u0026rsquo;s processing power. 
The core concept remains the same — what changes is where the transformation happens.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/etl/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eETL\u003c/strong\u003e (Extract, Transform, Load) is the fundamental process through which data is moved from source systems (operational databases, files, APIs) into the data warehouse.\u003c/p\u003e\n\u003ch2 id=\"the-three-phases\" class=\"relative group\"\u003eThe three phases \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#the-three-phases\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eExtract\u003c/strong\u003e: pulling data from source systems. Can be full (complete load) or incremental (only new or changed data)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eTransform\u003c/strong\u003e: cleaning, validating, standardizing and enriching the data. This is where business rules are applied, dimension lookups performed, derived calculations computed\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLoad\u003c/strong\u003e: loading the transformed data into the data warehouse tables (fact and dimension)\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"why-it-matters\" class=\"relative group\"\u003eWhy it matters \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#why-it-matters\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eETL is the least visible but most critical part of a data warehouse. 
If data is extracted incompletely, transformed with incorrect rules, or loaded without checks, everything built on top — reports, dashboards, decisions — will be wrong.\u003c/p\u003e","title":"ETL"},{"content":"Exchange Partition is an Oracle DDL operation that allows you to instantly swap the contents of a partition with those of a non-partitioned table. Not a single byte of data is moved — the operation only modifies pointers in the data dictionary.\nHow it works #The ALTER TABLE ... EXCHANGE PARTITION ... WITH TABLE ... command modifies metadata in the data dictionary so that the physical segments of the partition and the staging table swap ownership. The staging table becomes the partition and vice versa. The operation takes less than a second regardless of data volume, because it involves no physical data movement.\nWhat it\u0026rsquo;s for #In data warehouses, exchange partition is the primary tool for bulk data loading. The typical process is: the ETL loads data into a staging table, builds indexes, validates the data, and then executes the exchange with the target partition. During the exchange, queries on other partitions continue working without interruption.\nWhat can go wrong #The WITHOUT VALIDATION clause skips the check that the staging table\u0026rsquo;s data actually falls within the partition\u0026rsquo;s range — it speeds up the operation but requires the ETL to guarantee data correctness. If the staging data contains out-of-range dates, they end up in the wrong partition with no error raised. 
The INCLUDING INDEXES clause requires the staging table to have indexes with the same structure as the partitioned table\u0026rsquo;s local indexes.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/exchange-partition/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eExchange Partition\u003c/strong\u003e is an Oracle DDL operation that allows you to instantly swap the contents of a partition with those of a non-partitioned table. Not a single byte of data is moved — the operation only modifies pointers in the data dictionary.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe \u003ccode\u003eALTER TABLE ... EXCHANGE PARTITION ... WITH TABLE ...\u003c/code\u003e command modifies metadata in the data dictionary so that the physical segments of the partition and the staging table swap ownership. The staging table becomes the partition and vice versa. The operation takes less than a second regardless of data volume, because it involves no physical data movement.\u003c/p\u003e","title":"Exchange Partition"},{"content":"An execution plan is the sequence of operations the database chooses to resolve a SQL query. When you write a SELECT with JOINs, WHERE filters, and sorts, the optimizer evaluates dozens of possible strategies and picks one based on available statistics.\nHow it works #The plan is represented as a tree of nodes: each node is an operation (scan, join, sort, aggregate) that receives data from its child nodes and passes it to its parent. 
In PostgreSQL you view it with EXPLAIN (estimated plan) or EXPLAIN ANALYZE (actual plan with real timings and row counts).\nThe optimizer decides for each node which strategy to use: sequential scan or index scan for table access, nested loop, hash join or merge join for joins, sort or hash for groupings.\nWhy it matters #Correctly reading an execution plan is the most important skill for query tuning. Looking at the total time is not enough: you need to compare estimated rows against actual rows node by node, check buffer I/O, and identify where the optimizer made poor choices.\nA wrong estimate on even a single node can cascade through the entire plan, turning a millisecond query into one that takes minutes.\nWhat can go wrong #The most common problems in execution plans:\nWrong cardinality estimates: the optimizer thinks a table returns 100 rows when it actually returns 2 million Wrong join type: a nested loop chosen where a hash join was needed, due to stale statistics Ignored index: a sequential scan on a large table because statistics don\u0026rsquo;t reflect the real data distribution Disk spill: sort or hash operations that don\u0026rsquo;t fit in work_mem and end up writing to disk ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/execution-plan/","section":"Glossary","summary":"\u003cp\u003eAn \u003cstrong\u003eexecution plan\u003c/strong\u003e is the sequence of operations the database chooses to resolve a SQL query. 
When you write a SELECT with JOINs, WHERE filters, and sorts, the optimizer evaluates dozens of possible strategies and picks one based on available statistics.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe plan is represented as a tree of nodes: each node is an operation (scan, join, sort, aggregate) that receives data from its child nodes and passes it to its parent. In PostgreSQL you view it with \u003ccode\u003eEXPLAIN\u003c/code\u003e (estimated plan) or \u003ccode\u003eEXPLAIN ANALYZE\u003c/code\u003e (actual plan with real timings and row counts).\u003c/p\u003e","title":"Execution Plan"},{"content":"The Facilitator is the person tasked with guiding the flow of a meeting. They are not the decision-maker — they ensure that decisions get made in an orderly manner, within the allotted time, and with input from all participants.\nHow it works #The facilitator keeps time, manages speaking turns, cuts off-topic discussions (\u0026ldquo;I\u0026rsquo;ll note it in the parking lot, we\u0026rsquo;ll discuss it after\u0026rdquo;), and ensures the standup does not exceed 15 minutes. The role can be fixed or rotated within the team.\nWhat it\u0026rsquo;s for #Without a facilitator, meetings naturally expand. Someone speaks more than their share, someone else never speaks, and topics multiply without control. The facilitator is the guardian of everyone\u0026rsquo;s time — not authoritarian, but respectful.\nWhy it matters #The best facilitator is one who cuts naturally: \u0026ldquo;Interesting, we\u0026rsquo;ll discuss it right after. 
Marco, your turn.\u0026rdquo; Without this role, the standup degenerates within three weeks. The difference between a 15-minute standup and a 45-minute one is almost always the presence (or absence) of a facilitator with a backbone.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/facilitatore/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eFacilitator\u003c/strong\u003e is the person tasked with guiding the flow of a meeting. They are not the decision-maker — they ensure that decisions get made in an orderly manner, within the allotted time, and with input from all participants.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe facilitator keeps time, manages speaking turns, cuts off-topic discussions (\u0026ldquo;I\u0026rsquo;ll note it in the parking lot, we\u0026rsquo;ll discuss it after\u0026rdquo;), and ensures the standup does not exceed 15 minutes. The role can be fixed or rotated within the team.\u003c/p\u003e","title":"Facilitator"},{"content":"A fact table is the central table of a star schema in a data warehouse. It contains numeric measures — amounts, quantities, counts, durations — and the foreign keys that connect it to dimension tables.\nStructure #Each row in a fact table represents a business event or transaction: a sale, a claim, a shipment, a login. Columns fall into two categories:\nForeign keys: point to dimension tables (who, what, where, when) Measures: numeric values to aggregate (amount, quantity, margin) Types of fact tables # Transaction fact: one row per event (e.g. 
each sale) Periodic snapshot: one row per period per entity (e.g. monthly balance per account) Accumulating snapshot: one row per process, updated at each milestone (e.g. order-shipment-invoice cycle) Relationship with SCDs #When dimensions use SCD Type 2, the fact table points to the dimension\u0026rsquo;s surrogate key — not the natural key. This ensures every fact is associated with the correct dimension version for the moment it occurred.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/fact-table/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003efact table\u003c/strong\u003e is the central table of a star schema in a data warehouse. It contains numeric measures — amounts, quantities, counts, durations — and the foreign keys that connect it to dimension tables.\u003c/p\u003e\n\u003ch2 id=\"structure\" class=\"relative group\"\u003eStructure \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#structure\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach row in a fact table represents a business event or transaction: a sale, a claim, a shipment, a login. Columns fall into two categories:\u003c/p\u003e","title":"Fact table"},{"content":"The Financial Float is the liquidity a company generates from the difference between collection times from its clients (shorter) and payment times to its suppliers (longer). It is effectively a zero-cost loan obtained at the suppliers\u0026rsquo; expense.\nHow it works #A consulting firm collects from the end client at 30 days but pays its consultants at 90 days. 
The 60-day difference generates a float: for every €100,000 in monthly revenue, the company has ~€200,000 of free liquidity it can invest or use as working capital.\nWhat it\u0026rsquo;s for #For large companies it is a structural financial lever. For freelance consultants it is the perverse mechanism by which they find themselves financing their clients interest-free, without guarantees and without alternatives — because the market \u0026ldquo;works this way.\u0026rdquo;\nWhy it matters #Financial float is an invisible wealth transfer from supplier to client. A consultant who works in October and gets paid in February is providing a four-month loan. Nobody calls it that — they call it \u0026ldquo;contractual terms.\u0026rdquo; But economically it is identical.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/float-finanziario/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eFinancial Float\u003c/strong\u003e is the liquidity a company generates from the difference between collection times from its clients (shorter) and payment times to its suppliers (longer). It is effectively a zero-cost loan obtained at the suppliers\u0026rsquo; expense.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA consulting firm collects from the end client at 30 days but pays its consultants at 90 days. 
The 60-day difference generates a float: for every €100,000 in monthly revenue, the company has ~€200,000 of free liquidity it can invest or use as working capital.\u003c/p\u003e","title":"Financial Float"},{"content":"FLUSH PRIVILEGES is a MySQL/MariaDB command that forces the server to reload the privilege tables from the mysql database into memory. It makes permission changes immediately effective.\nHow it works #MySQL keeps an in-memory cache of the grant tables (mysql.user, mysql.db, mysql.tables_priv). When using CREATE USER and GRANT, MySQL updates both the tables and the cache automatically. But if grant tables are modified directly with INSERT, UPDATE or DELETE, the cache is not updated. FLUSH PRIVILEGES forces a cache reload from the tables.\nWhat it\u0026rsquo;s for #The command is needed after: direct deletion of users from the mysql.user table, manual privilege changes via DML, or after a DROP USER of anonymous users as part of security hardening. Without the FLUSH, changes don\u0026rsquo;t take effect until the next server restart.\nWhen to use it #After any direct modification to the grant tables. If exclusively using CREATE USER, GRANT, REVOKE and DROP USER, FLUSH is not technically necessary because these commands update the cache automatically. However, running it after a DROP USER of anonymous users is good practice to ensure consistency.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/flush-privileges/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eFLUSH PRIVILEGES\u003c/strong\u003e is a MySQL/MariaDB command that forces the server to reload the privilege tables from the \u003ccode\u003emysql\u003c/code\u003e database into memory. 
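As a minimal sketch of the scenario where the reload is actually required — direct DML on a grant table (the account name is hypothetical):

```sql
-- Direct UPDATE on a grant table: MySQL's in-memory privilege cache is NOT refreshed.
UPDATE mysql.user
   SET account_locked = 'Y'
 WHERE User = 'legacy_app' AND Host = '%';   -- hypothetical account

-- Force the server to reload the grant tables so the change takes effect now,
-- instead of at the next restart.
FLUSH PRIVILEGES;
```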
It makes permission changes immediately effective.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eMySQL keeps an in-memory cache of the grant tables (\u003ccode\u003emysql.user\u003c/code\u003e, \u003ccode\u003emysql.db\u003c/code\u003e, \u003ccode\u003emysql.tables_priv\u003c/code\u003e). When using \u003ccode\u003eCREATE USER\u003c/code\u003e and \u003ccode\u003eGRANT\u003c/code\u003e, MySQL updates both the tables and the cache automatically. But if grant tables are modified directly with \u003ccode\u003eINSERT\u003c/code\u003e, \u003ccode\u003eUPDATE\u003c/code\u003e or \u003ccode\u003eDELETE\u003c/code\u003e, the cache is not updated. \u003ccode\u003eFLUSH PRIVILEGES\u003c/code\u003e forces a cache reload from the tables.\u003c/p\u003e","title":"FLUSH PRIVILEGES"},{"content":"A Folding Bike is a bicycle designed to fold into compact dimensions (typically 60×55×25 cm) in seconds, becoming transportable as luggage. The Brompton is the best-known model, with a folding mechanism that takes 10-20 seconds.\nHow it works #A system of hinges and quick releases allows folding the frame, handlebars, and pedals into a compact package. Once folded, it goes into the office under the desk, onto the metro, or into a car trunk. In the electric version, it combines the advantages of pedal assist with total portability.\nWhat it\u0026rsquo;s for #It completely eliminates the parking problem — which in Rome can cost €35 per day and an hour and a half of searching. There is no theft risk because the bike is always with you. 
And on heavy rain days, it folds up and goes on the metro without issues.\nWhy it matters #The folding bike is the \u0026ldquo;superpower\u0026rdquo; that makes bike commuting practical even for those without dedicated parking, those who need public transport for part of the journey, or those working in offices without bike racks. It is the last-mile solution that cars cannot offer.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/bicicletta-pieghevole/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eFolding Bike\u003c/strong\u003e is a bicycle designed to fold into compact dimensions (typically 60×55×25 cm) in seconds, becoming transportable as luggage. The Brompton is the best-known model, with a folding mechanism that takes 10-20 seconds.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA system of hinges and quick releases allows folding the frame, handlebars, and pedals into a compact package. Once folded, it goes into the office under the desk, onto the metro, or into a car trunk. In the electric version, it combines the advantages of pedal assist with total portability.\u003c/p\u003e","title":"Folding Bike"},{"content":"Full Table Scan (or TABLE ACCESS FULL) is an operation where the database reads every data block of a table, from start to finish, without going through any index.\nHow it works #Oracle requests blocks from disk (or cache) sequentially, using multi-block reads (db file scattered read). 
Every row in the table is examined, regardless of whether it matches the query criteria.\nWhen it\u0026rsquo;s a problem #A full table scan on a large table is often a sign of a missing index, stale statistics, or a changed execution plan. In the AWR report it shows up as db file scattered read in the Top Wait Events section, with a high percentage of DB time.\nWhen it\u0026rsquo;s legitimate #On small tables (a few thousand rows) or when the query genuinely needs to read most of the data, a full table scan can be more efficient than an index access. The problem arises when Oracle chooses it on tables with millions of rows to extract just a few records.\nHow to identify it #In the execution plan (EXPLAIN PLAN or DBMS_XPLAN) it appears as a TABLE ACCESS FULL operation. In AWR/ASH wait events it manifests as a dominant db file scattered read.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/full-table-scan/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eFull Table Scan\u003c/strong\u003e (or TABLE ACCESS FULL) is an operation where the database reads every data block of a table, from start to finish, without going through any index.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eOracle requests blocks from disk (or cache) sequentially, using multi-block reads (\u003ccode\u003edb file scattered read\u003c/code\u003e). 
Every row in the table is examined, regardless of whether it matches the query criteria.\u003c/p\u003e","title":"Full Table Scan"},{"content":"A GIN Index (Generalized Inverted Index) is a PostgreSQL index type designed for indexing composite values: arrays, JSONB documents, text with trigrams and full-text searches. Unlike B-Tree, a GIN creates an inverted mapping: from each element (word, trigram, JSON key) to the records containing it.\nHow it works #For each distinct value in the indexed data, GIN maintains a list of pointers to the rows containing that value. In the case of pg_trgm, text is decomposed into trigrams (3-character sequences) and each trigram is indexed. A LIKE '%ABC%' search is translated into a trigram intersection, avoiding sequential scanning.\nWhat it\u0026rsquo;s for #GIN solves the \u0026ldquo;contains\u0026rdquo; search problem (LIKE '%value%') on text columns, which with a B-Tree would require a sequential scan of the entire table. On tables with millions of rows, the difference is between seconds and milliseconds.\nWhen to use it #GIN is ideal on append-only tables or those with low churn (few UPDATEs/DELETEs), as the index maintenance cost is higher than B-Tree. Creation in production should use CREATE INDEX CONCURRENTLY to avoid write locks.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/gin-index/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eGIN Index\u003c/strong\u003e (Generalized Inverted Index) is a PostgreSQL index type designed for indexing composite values: arrays, JSONB documents, text with trigrams and full-text searches. 
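A sketch of the pg_trgm setup the entry describes; the table and index names are illustrative:

```sql
-- The pg_trgm extension provides the trigram operator class for GIN.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- CONCURRENTLY builds the index without taking a write lock
-- (note: it cannot run inside a transaction block).
CREATE INDEX CONCURRENTLY idx_products_name_trgm
    ON products USING gin (name gin_trgm_ops);

-- This "contains" search can now resolve via trigram intersection
-- instead of a sequential scan.
SELECT * FROM products WHERE name LIKE '%ABC%';
```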
Unlike B-Tree, a GIN creates an inverted mapping: from each element (word, trigram, JSON key) to the records containing it.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eFor each distinct value in the indexed data, GIN maintains a list of pointers to the rows containing that value. In the case of \u003ccode\u003epg_trgm\u003c/code\u003e, text is decomposed into trigrams (3-character sequences) and each trigram is indexed. A \u003ccode\u003eLIKE '%ABC%'\u003c/code\u003e search is translated into a trigram intersection, avoiding sequential scanning.\u003c/p\u003e","title":"GIN Index"},{"content":"","date":null,"permalink":"https://ivanluminaria.com/en/glossary/","section":"Glossary","summary":"","title":"Glossary"},{"content":"The grain (granularity) is the level of detail of a fact table in a data warehouse. It defines what a single row represents: a transaction, a daily summary, a monthly total, an invoice line.\nHow it works #Choosing the grain is the first decision when designing a fact table. 
Every other choice — measures, dimensions, ETL — follows from it:\nFine grain (e.g., invoice line): maximum query flexibility, more rows to manage Aggregated grain (e.g., monthly total per customer): fewer rows, faster queries, but no ability to drill into detail Kimball\u0026rsquo;s fundamental principle: always model at the finest level of detail available in the source system.\nWhat it\u0026rsquo;s for #The grain determines:\nWhich questions the data warehouse can answer Which dimensions are needed (a line-level grain requires dim_product, a monthly grain doesn\u0026rsquo;t) How large the fact table is and how long the ETL takes Whether drill-down in reports is possible or not When to use it #The grain is defined during the dimensional model design phase, before writing any DDL or ETL. Changing the grain after go-live is equivalent to rebuilding the data warehouse from scratch — which is why the initial choice is so critical.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/grain/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003egrain\u003c/strong\u003e (granularity) is the level of detail of a fact table in a data warehouse. It defines what a single row represents: a transaction, a daily summary, a monthly total, an invoice line.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eChoosing the grain is the first decision when designing a fact table. 
Every other choice — measures, dimensions, ETL — follows from it:\u003c/p\u003e","title":"Grain"},{"content":"GRANT is the SQL command used to assign privileges to a user or role on specific database objects. In MySQL and MariaDB, privileges are assigned to the 'user'@'host' pair, not just the username.\nHow it works #The basic syntax is GRANT \u0026lt;privileges\u0026gt; ON \u0026lt;database\u0026gt;.\u0026lt;table\u0026gt; TO 'user'@'host'. Privileges can be granular (SELECT, INSERT, UPDATE, DELETE) or global (ALL PRIVILEGES). In MySQL 8, GRANT no longer creates users implicitly: an explicit CREATE USER is needed first, then GRANT. In MySQL 5.7 and MariaDB, GRANT with IDENTIFIED BY creates the user and assigns privileges in a single command.\nWhat it\u0026rsquo;s for #GRANT is the fundamental mechanism for implementing access control in MySQL/MariaDB databases. Combined with the user@host model, it allows calibrating privileges based on the connection origin: full access from localhost for the DBA, read-only from the application server.\nWhen to use it #Every time a user is created or permissions are modified. The best practice is to always assign the minimum necessary privilege (principle of least privilege) and use SHOW GRANTS FOR 'user'@'host' to verify effective privileges.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/grant/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eGRANT\u003c/strong\u003e is the SQL command used to assign privileges to a user or role on specific database objects. 
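In MySQL 8 syntax, the flow looks like this — user, host, and schema are illustrative placeholders:

```sql
-- MySQL 8: the account must be created explicitly before granting.
CREATE USER 'app_ro'@'10.0.0.%' IDENTIFIED BY 'change_me';  -- hypothetical account

-- Least privilege: read-only access, and only from the application subnet.
GRANT SELECT ON shop.* TO 'app_ro'@'10.0.0.%';

-- Verify the effective privileges of the user@host pair.
SHOW GRANTS FOR 'app_ro'@'10.0.0.%';
```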
In MySQL and MariaDB, privileges are assigned to the \u003ccode\u003e'user'@'host'\u003c/code\u003e pair, not just the username.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe basic syntax is \u003ccode\u003eGRANT \u0026lt;privileges\u0026gt; ON \u0026lt;database\u0026gt;.\u0026lt;table\u0026gt; TO 'user'@'host'\u003c/code\u003e. Privileges can be granular (SELECT, INSERT, UPDATE, DELETE) or global (ALL PRIVILEGES). In MySQL 8, GRANT no longer creates users implicitly: an explicit \u003ccode\u003eCREATE USER\u003c/code\u003e is needed first, then GRANT. In MySQL 5.7 and MariaDB, GRANT with \u003ccode\u003eIDENTIFIED BY\u003c/code\u003e creates the user and assigns privileges in a single command.\u003c/p\u003e","title":"GRANT"},{"content":"Group Replication is MySQL\u0026rsquo;s native mechanism for creating high-availability clusters with synchronous replication across multiple nodes. Unlike classic replication (asynchronous, master-slave), Group Replication ensures every transaction is confirmed by a majority of nodes before being considered committed.\nHow it works #Nodes communicate via a group protocol (GCS — Group Communication System) that handles distributed consensus. Each node maintains a full copy of the data. Transactions are certified by the group: if there are no conflicts, they are applied on all nodes. If a conflict arises, the transaction is rolled back on the originating node.\nOperating modes #MySQL Group Replication supports two modes: single-primary (only one node accepts writes, the others are read-only) and multi-primary (all nodes accept writes). 
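On a running cluster, membership and roles can be inspected from performance_schema (the host names returned would be your own):

```sql
-- One row per node; MEMBER_ROLE distinguishes PRIMARY from SECONDARY
-- (single-primary mode shows exactly one PRIMARY).
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;
```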
Single-primary mode is the most common in production because it avoids concurrent write conflicts.\nWhy it matters #Group Replication handles failover automatically: if the primary goes down, the cluster elects a new primary from the secondaries within seconds. This makes it suitable for environments that require high availability without manual intervention. It requires a minimum of three nodes to maintain quorum.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/group-replication/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eGroup Replication\u003c/strong\u003e is MySQL\u0026rsquo;s native mechanism for creating high-availability clusters with synchronous replication across multiple nodes. Unlike classic replication (asynchronous, master-slave), Group Replication ensures every transaction is confirmed by a majority of nodes before being considered committed.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eNodes communicate via a group protocol (GCS — Group Communication System) that handles distributed consensus. Each node maintains a full copy of the data. Transactions are certified by the group: if there are no conflicts, they are applied on all nodes. If a conflict arises, the transaction is rolled back on the originating node.\u003c/p\u003e","title":"Group Replication"},{"content":"GTID (Global Transaction Identifier) is a unique identifier automatically assigned to every committed transaction on a MySQL server. 
The format is server_uuid:transaction_id — for example 3E11FA47-71CA-11E1-9E33-C80AA9429562:23.\nHow it works #When GTID is enabled (gtid_mode = ON), every transaction receives an identifier that makes it traceable across any server in the replication cluster. The replica knows exactly which transactions it has already executed and which ones it still needs, without having to manually specify binlog positions (file + offset).\nThe set of all GTIDs executed on a server is stored in the gtid_executed variable. When a replica connects to the source, it compares its own gtid_executed with the source\u0026rsquo;s to determine which transactions are missing.\nWhat it\u0026rsquo;s for #GTID radically simplifies MySQL replication management:\nAutomatic failover: when the source goes down, a replica can become the new source and the other replicas realign automatically Consistency verification: it\u0026rsquo;s possible to verify whether two servers have executed exactly the same transactions Backup and restore: tools like mysqldump and mydumper must handle GTIDs correctly to avoid replication conflicts after restore When it causes problems #GTIDs require attention during backup and restore operations. If a dump is restored on a server with GTID enabled without correctly setting --set-gtid-purged, it can generate conflicts that break the replication chain.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/gtid/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eGTID\u003c/strong\u003e (Global Transaction Identifier) is a unique identifier automatically assigned to every committed transaction on a MySQL server. 
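Two quick checks that follow from this — the GTID sets returned are of course server-specific:

```sql
-- The full set of transactions this server has applied.
SELECT @@GLOBAL.gtid_executed;

-- On a replica: compare with the source to see what is still missing
-- (the output includes Retrieved_Gtid_Set and Executed_Gtid_Set).
SHOW REPLICA STATUS\G
```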
The format is \u003ccode\u003eserver_uuid:transaction_id\u003c/code\u003e — for example \u003ccode\u003e3E11FA47-71CA-11E1-9E33-C80AA9429562:23\u003c/code\u003e.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen GTID is enabled (\u003ccode\u003egtid_mode = ON\u003c/code\u003e), every transaction receives an identifier that makes it traceable across any server in the replication cluster. The replica knows exactly which transactions it has already executed and which ones it still needs, without having to manually specify binlog positions (file + offset).\u003c/p\u003e","title":"GTID"},{"content":"Hash join is a join strategy designed for large data volumes. It works in two phases: first it builds a data structure in memory, then uses it to find matches efficiently.\nHow it works #The database reads the smaller table (build side) and builds a hash table in memory, indexing rows by the join column. Then it scans the larger table (probe side) and for each row looks up the match in the hash table with an O(1) lookup.\nThe complexity is linear — proportional to the sum of rows in both tables, not the product as in a nested loop. No indexes are needed: the hash table temporarily replaces the index.\nWhen it\u0026rsquo;s the right choice #The optimizer chooses hash join when both tables are large and there are no useful indexes, or when statistics indicate that the number of rows to combine is too high for an efficient nested loop. 
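In a PostgreSQL plan the strategy is visible directly; with two large, unindexed tables (illustrative names), a typical shape is:

```sql
EXPLAIN
SELECT o.*, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id;

-- Typical plan shape (not guaranteed output):
--   Hash Join
--     Hash Cond: (o.customer_id = c.id)
--     ->  Seq Scan on orders o            -- probe side (larger input)
--     ->  Hash                            -- build side, hashed into work_mem
--           ->  Seq Scan on customers c
```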
It\u0026rsquo;s one of the most common strategies in data warehouses and reports that aggregate millions of rows.\nWhat can go wrong #The weak point is memory. The hash table must fit in work_mem: if the smaller table doesn\u0026rsquo;t fit, the database writes batches to disk (batched hash join), with a significant performance degradation.\nwork_mem too low: the hash table is split into batches on disk, multiplying I/O Wrong estimates: the optimizer picks the wrong table as build side because statistics report fewer rows than reality Data skew: if one value in the join column dominates most rows, one hash bucket becomes huge while the rest stay empty ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/hash-join/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eHash join\u003c/strong\u003e is a join strategy designed for large data volumes. It works in two phases: first it builds a data structure in memory, then uses it to find matches efficiently.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe database reads the smaller table (build side) and builds a hash table in memory, indexing rows by the join column. Then it scans the larger table (probe side) and for each row looks up the match in the hash table with an O(1) lookup.\u003c/p\u003e","title":"Hash Join"},{"content":"Hot Desk (hot desking) is a workspace organization model where desks are not assigned to individual employees. 
Whoever comes to the office takes an available workstation, typically bookable through a digital system.\nHow it works #Instead of 50 fixed workstations for 50 employees, the company sets up 15-20 shared workstations (hot desks) equipped with monitors, docking stations, and connectivity. Employees book a workstation on the days they need to be in the office, using the other days for remote work.\nWhat it\u0026rsquo;s for #It drastically reduces real estate costs: going from 50 workstations to 15 saves approximately 70% of space and related costs (rent, utilities, cleaning, maintenance). The saved space can be converted into proper meeting rooms and collaborative areas.\nWhat can go wrong #Without an efficient booking system, conflicts and frustration arise. Employees who don\u0026rsquo;t have \u0026ldquo;their own desk\u0026rdquo; may feel less rooted in the company. The solution is combining hot desks with personal spaces (lockers) and dedicated team areas.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/hot-desk/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eHot Desk\u003c/strong\u003e (hot desking) is a workspace organization model where desks are not assigned to individual employees. Whoever comes to the office takes an available workstation, typically bookable through a digital system.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eInstead of 50 fixed workstations for 50 employees, the company sets up 15-20 shared workstations (hot desks) equipped with monitors, docking stations, and connectivity. 
Employees book a workstation on the days they need to be in the office, using the other days for remote work.\u003c/p\u003e","title":"Hot Desk"},{"content":"Huge Pages are 2 MB memory pages, compared to Linux\u0026rsquo;s standard 4 KB. For a 64 GB Oracle SGA, switching from 4 KB pages (16.7 million pages) to 2 MB Huge Pages (32,768 pages) reduces the number of Page Table entries by a factor of 500.\nHow it works #They are configured via the kernel parameter vm.nr_hugepages in /etc/sysctl.d/. The required number is calculated by dividing the SGA size by 2 MB and adding a 1.5% margin. After restarting the Oracle instance, the SGA is allocated in Huge Pages, verifiable from /proc/meminfo.\nWhat it\u0026rsquo;s for #They reduce pressure on the CPU\u0026rsquo;s TLB (Translation Lookaside Buffer), which can cache only a few thousand address translations. With normal pages, the TLB constantly overflows and the MMU must handle millions of translations — with measurable impact on latch free waits and library cache contention.\nWhy it matters #It is the single most impactful parameter for Oracle on Linux, and the one most often ignored. The installation wizard doesn\u0026rsquo;t configure it, the documentation is in an MOS note, and the system \u0026ldquo;works without it.\u0026rdquo; But the before/after metrics speak clearly: library cache hit ratio from 92% to 99.7%, CPU from 78% to 41%.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/huge-pages/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eHuge Pages\u003c/strong\u003e are 2 MB memory pages, compared to Linux\u0026rsquo;s standard 4 KB. 
I/O Scheduler #

The I/O Scheduler is the Linux kernel component that manages the queue of read and write requests to block devices (disks). It decides the execution order of requests to optimize throughput and minimize latency.

How it works #

Linux offers several schedulers: cfq (Completely Fair Queuing, for desktops), deadline/mq-deadline (for servers and databases), noop/none (for SSD/NVMe). For Oracle the recommendation is deadline, which serves requests while minimizing disk seeks. It is configured via /sys/block/sdX/queue/scheduler and made permanent via GRUB.

What it's for #

The default cfq distributes I/O equally among processes — ideal for a desktop, terrible for a database that needs priority on critical I/O requests. deadline ensures no request stays queued too long, reducing db file sequential read latency.

What can go wrong #

Leaving the default (cfq, or bfq on some systems) means Oracle competes for I/O with every other process on the system. On a dedicated database server this is wasteful: the database should have absolute priority on disk operations.
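A sketch of checking and changing the scheduler at runtime. The device name sda is a placeholder, and the persistence mechanism is an assumption that depends on distribution and kernel version (older kernels honour the elevator= boot parameter; blk-mq kernels typically need a udev rule instead):

```shell
# Show the active scheduler (the one in square brackets)
cat /sys/block/sda/queue/scheduler
# e.g. [mq-deadline] kyber bfq none

# Switch at runtime (as root; lost on reboot)
echo mq-deadline > /sys/block/sda/queue/scheduler

# To persist on older kernels, add to GRUB_CMDLINE_LINUX in /etc/default/grub:
#   elevator=deadline
# then rebuild the GRUB config and reboot.
```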
INTO OUTFILE #

INTO OUTFILE is a MySQL SQL clause that exports the result of a query directly to a file on the database server's filesystem. It is the native method for generating CSV, TSV or custom-delimited files.

How it works #

The clause is appended to a SELECT statement and specifies the destination file path. The FIELDS TERMINATED BY, ENCLOSED BY and LINES TERMINATED BY parameters control the output format. The file is created by the MySQL system user (not the user running the query), so it must be written to a directory with the correct permissions.

What it's for #

INTO OUTFILE is useful for bulk data exports from the database to structured text files. It is the complement of LOAD DATA INFILE, which performs the reverse operation (importing data from files). Together they form MySQL's native bulk import/export mechanism.

When to use it #

Usage is governed by the secure-file-priv directive: the destination file must be within the authorised directory. When secure-file-priv blocks the desired path, the alternative is the mysql command-line client with -B -e and output redirection, which is not subject to the same restriction.
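A minimal sketch of the clause in action. The table, columns, and path are illustrative; the path must sit inside the directory reported by `SHOW VARIABLES LIKE 'secure_file_priv';` (often /var/lib/mysql-files/ on packaged installs):

```sql
-- Export a table as quoted CSV to the server's filesystem.
SELECT id, name, created_at
INTO OUTFILE '/var/lib/mysql-files/customers.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM customers;
```

The statement fails if the file already exists — MySQL never overwrites an existing OUTFILE target.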
Issue Tracker #

An Issue Tracker is a system for recording, assigning, prioritising and monitoring bugs, feature requests and project tasks. On GitHub it is integrated directly into the code repository.

How it works #

Every problem or request is created as an "issue" with a title, a description, category/priority labels and an assignee. Issues can be linked to branches and Pull Requests: when a PR referencing an issue is merged, the issue closes automatically. This creates complete traceability from problem to solution.

What it's for #

The issue tracker replaces emails, chat, Excel spreadsheets and verbal reports with a single, structured system. Every bug has a history: who reported it, who is working on it, what its status is, which code resolved it. In chaotic projects, the issue tracker is the tool that turns confusion into visibility.

When to use it #

On every software project, regardless of size. The alternative — reports scattered across email, chat and Excel — is the main cause of information loss and duplicated work. GitHub Issues, Jira and Linear are the most widespread platforms.
IST #

IST (Incremental State Transfer) is the mechanism by which a Galera node rejoining the cluster after a brief absence receives only the missing transactions, without having to download the entire dataset.

How it works #

When a node reconnects to the cluster, the donor checks whether the missing transactions are still available in its gcache (Galera cache). If the gap is covered by the gcache, an IST is performed: only the missing transactions are sent to the node, which applies them and returns to the Synced state. If the gap exceeds the gcache, Galera falls back to a full SST.

What it's for #

IST makes a node's return to the cluster much faster than a full SST. A node that has been offline for a few minutes or hours can become operational again in seconds, with no impact on cluster performance.

When to use it #

IST is triggered automatically when conditions allow it. The gcache size (gcache.size) determines how many transactions the cluster keeps on hand to support IST. A larger gcache tolerates longer node downtime without requiring an SST.
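The gcache is sized through the Galera provider options in the server configuration. The 2 GB value below is illustrative, not a recommendation — the right size depends on write volume and how long a node may plausibly stay offline:

```
[mysqld]
# A larger gcache widens the window in which a returning node can still
# use IST instead of falling back to a full SST.
wsrep_provider_options="gcache.size=2G"
```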
Kimball #

Kimball refers to Ralph Kimball and his data warehouse design methodology, described in The Data Warehouse Toolkit (first edition 1996, third edition 2013).

The approach #

The Kimball methodology rests on three pillars:

- Dimensional modeling: organizing data into star schemas with fact tables and dimension tables, optimized for analytical queries
- Bottom-up: building the DWH starting from individual departmental data marts, progressively integrating them through conformed dimensions
- Bus architecture: a framework for ensuring consistency across data marts through shared dimensions and facts

Slowly Changing Dimensions #

Kimball defined the SCD (Slowly Changing Dimension) classification into types 0 through 7, which has become the de facto industry standard. Type 2 — with surrogate keys and validity dates — is the most widely used for tracking dimension history.

Kimball vs Inmon #

The main alternative is Bill Inmon's methodology, which proposes a top-down approach with a normalized (3NF) enterprise data warehouse from which data marts are derived. The two methodologies are not mutually exclusive and many real-world projects adopt elements of both.
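A minimal sketch of what a Type 2 dimension looks like in practice — table and column names are illustrative. The surrogate key and validity dates are what let several historical versions of the same business entity coexist:

```sql
-- Kimball SCD Type 2 dimension: one row per version of a customer.
CREATE TABLE dim_customer (
    customer_sk   INTEGER      NOT NULL,  -- surrogate key, unique per version
    customer_id   INTEGER      NOT NULL,  -- natural/business key, repeats across versions
    name          VARCHAR(100),
    valid_from    DATE         NOT NULL,
    valid_to      DATE,                   -- NULL (or a far-future date) marks the current version
    PRIMARY KEY (customer_sk)
);
```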
Knowledge Transfer #

Knowledge Transfer is the process through which skills, information, and know-how are transferred from those who possess them to those who need them — between colleagues, between teams, or between people and documentation systems.

How it works #

It can be formal (documentation, training sessions, wikis) or informal (pair programming, mentoring, shadowing). AI can accelerate knowledge transfer by generating documentation from code, commits, and issues — not perfect, but enough to avoid losing knowledge when someone leaves the project.

What it's for #

Every IT project depends on the tacit knowledge of the people working on it. When a senior developer leaves the team without documenting their architectural decisions, the cost of the loss is invisible but enormous: weeks of reverse engineering, bugs introduced through misunderstanding, and decisions repeated because nobody remembers the rationale.

Why it matters #

It is one of the three areas where AI generates concrete value in project management. Nobody documents willingly — AI can bridge this gap. But knowledge transfer is not just documentation: it is also the ability to pass on context, motivations, and lessons learned, not just operational instructions.

KPI #

A KPI (Key Performance Indicator) is a quantifiable metric used to evaluate the success of an activity, project, or organization against predefined objectives. In the context of remote work, KPIs replace physical presence as the indicator of productivity.

How it works #

An effective KPI is specific, measurable, and tied to a concrete objective. In IT consulting: tickets closed, code released, SLAs met, satisfied clients.
Not "hours at the desk" — because hours don't measure value produced, they only measure time spent.

What it's for #

It enables managing work by objectives instead of presence. A consultant who closes 20 tickets from home is more productive than one who closes 8 in the office. KPIs make this difference visible and measurable, removing room for subjective perception.

Why it matters #

Companies that cannot define clear KPIs cannot adopt smart working seriously. Without objective metrics, management falls back on visual control — "if I can see you at your desk, you're working" — which is the flawed assumption underlying presenteeism.
Late Payment Interest #

Late Payment Interest is the interest that automatically accrues on every invoice paid after the contractual due date. Under Italian Legislative Decree 231/2002 (implementing EU Directive 2011/7/EU), the rate equals the ECB rate plus 8 percentage points, with no formal notice required.

How it works #

From the day after the invoice due date, interest accrues automatically. The creditor is also entitled to a flat €40 compensation per late-paid invoice for recovery costs. No formal demand letter is required — the right arises from the law itself.

What it's for #

It is the primary legal tool for discouraging late payments. In theory, it should make delaying economically disadvantageous for the debtor. In Italian practice, almost no consultant claims it for fear of losing the client — which renders the tool ineffective.

What can go wrong #

The reputational cost of claiming late payment interest is perceived as higher than the financial benefit. A consultant who sends a formal claim is a consultant who "won't be called again". The system relies on the supplier's structural docility — and it works, until it doesn't.
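A worked example of the arithmetic, assuming an illustrative ECB reference rate of 4% (the actual rate is set half-yearly and must be checked for the period in question):

```shell
# Illustrative only: interest on a 10,000 EUR invoice paid 60 days late.
awk 'BEGIN {
  principal = 10000
  rate = 0.04 + 0.08            # ECB reference rate + 8 points (D.Lgs. 231/2002)
  days = 60
  interest = principal * rate * days / 365
  printf "interest: %.2f EUR (+ 40 EUR flat recovery fee)\n", interest
}'
# → interest: 197.26 EUR (+ 40 EUR flat recovery fee)
```

Roughly 2% of the invoice for two months of delay — which helps explain why, in practice, the deterrent only works if someone actually claims it.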
Least Privilege #

Least Privilege is a fundamental information security principle: every user, process or system should have only the permissions strictly necessary to perform its function, nothing more.

How it works #

In the database context, the principle is applied by assigning granular privileges: SELECT if the user only needs to read, SELECT plus INSERT and UPDATE if it also needs to write, and never ALL PRIVILEGES unless strictly necessary. Combined with MySQL's user@host model, the principle can also be applied based on the connection origin.

What it's for #

Limiting privileges reduces the attack surface. If an application is compromised, the attacker inherits the privileges of the application's database user. If that user has only SELECT on a specific database, the damage is contained. If it has ALL PRIVILEGES, the entire server is at risk.

When to use it #

Always. The principle of least privilege applies in every context: database users, operating system users, application roles, service accounts. The temptation to assign broad privileges "to avoid problems" is the most common cause of avoidable security incidents.
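A sketch of the principle in MySQL terms — the account name, host pattern, and schema are illustrative:

```sql
-- An application account limited to DML on one schema, from one subnet.
CREATE USER 'app'@'10.0.0.%' IDENTIFIED BY '...';
GRANT SELECT, INSERT, UPDATE ON appdb.* TO 'app'@'10.0.0.%';
-- Deliberately absent: DELETE, DROP, GRANT OPTION, and any access
-- to other databases. If the application is compromised, so is only this.
```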
Lift-and-Shift #

Lift-and-Shift (rehosting) is a migration strategy that consists of moving a system from one environment to another — typically from on-premises to cloud — without modifying its architecture, application code, or configuration. The system is taken as-is and "lifted and shifted".

How it works #

The infrastructure is replicated in the target environment: same virtual machines, same databases, same middleware. The advantage is speed: no code rewriting, no architectural redesign. The risk is carrying over all the problems of the original environment, including inefficiencies and technical debt.

When to use it #

When the priority is exiting a datacenter quickly (contract expiration, hardware decommissioning), when the budget does not allow a rearchitecture, or as the first phase of an incremental migration in which components are then modernized one by one.

What can go wrong #

A lift-and-shift to cloud without optimization can cost more than the original on-premises infrastructure. Applications not designed for the cloud do not exploit elasticity, auto-scaling, or managed services. The result is often a private datacenter rebuilt in the cloud at a higher price.
Local Index #

A Local Index is an Oracle index created on a partitioned table that is automatically partitioned with the same key and boundaries as the table. Each table partition has a corresponding index partition.

How it works #

When an index is created with the LOCAL clause, Oracle creates one index partition for each table partition. If the table has 100 monthly partitions, the index will have 100 corresponding partitions. DDL operations on a partition (DROP, TRUNCATE, SPLIT) invalidate only the corresponding index partition, not the entire index.

What it's for #

A local index is the preferred choice on partitioned tables because it preserves partition independence. A DROP PARTITION takes less than a second and doesn't invalidate any other index partition. With a global index, the same operation would invalidate the entire index, requiring hours of rebuild.

When to use it #

Use it when the index includes the partition key or when queries always filter on the partition column. For point lookups on non-partition columns (e.g. a primary key), a global index is needed instead. The rule: local where possible, global only where necessary.
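A minimal Oracle sketch — table, partition, and index names are illustrative:

```sql
-- Range-partitioned table by month.
CREATE TABLE sales (
    sale_date  DATE,
    amount     NUMBER
)
PARTITION BY RANGE (sale_date) (
    PARTITION p2024_01 VALUES LESS THAN (DATE '2024-02-01'),
    PARTITION p2024_02 VALUES LESS THAN (DATE '2024-03-01')
);

-- LOCAL creates one index partition per table partition;
-- dropping p2024_01 later touches only that index slice.
CREATE INDEX idx_sales_date ON sales (sale_date) LOCAL;
```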
MERGE #

MERGE is a SQL statement that combines INSERT and UPDATE (and optionally DELETE) operations in a single statement. If the record exists it updates it; if it doesn't, it inserts it. It is often informally called "upsert".

Oracle syntax #

MERGE INTO target_table d
USING source_table s ON (d.key = s.key)
WHEN MATCHED THEN UPDATE SET
    d.field = s.field
WHEN NOT MATCHED THEN INSERT (key, field)
    VALUES (s.key, s.field);

Use in data warehousing #

In an ETL context, MERGE is the core mechanism for loading dimension tables:

- SCD Type 1: a single MERGE that updates existing records and inserts new ones
- SCD Type 2: MERGE is used in the first phase to close modified records (setting the end validity date), followed by an INSERT for the new versions

Availability #

- Oracle: full support since version 9i
- PostgreSQL: no native MERGE until version 15. The alternative is INSERT ... ON CONFLICT (upsert)
- MySQL: uses INSERT ... ON DUPLICATE KEY UPDATE as an alternative
- SQL Server: full support with syntax similar to Oracle's
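For the non-Oracle engines listed above, the upsert idioms look like this — table and column names are illustrative, and both assume a unique constraint on the key column:

```sql
-- PostgreSQL (9.5+): INSERT ... ON CONFLICT
INSERT INTO target_table (id, field) VALUES (1, 'x')
ON CONFLICT (id) DO UPDATE SET field = EXCLUDED.field;

-- MySQL: INSERT ... ON DUPLICATE KEY UPDATE
INSERT INTO target_table (id, field) VALUES (1, 'x')
ON DUPLICATE KEY UPDATE field = VALUES(field);
```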
class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"p\"\u003e(\u003c/span\u003e\u003cspan class=\"n\"\u003ed\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"k\"\u003ekey\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"o\"\u003e=\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003es\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"k\"\u003ekey\u003c/span\u003e\u003cspan class=\"p\"\u003e)\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003eWHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eMATCHED\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eTHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eUPDATE\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eSET\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"w\"\u003e    \u003c/span\u003e\u003cspan class=\"n\"\u003ed\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"n\"\u003efield\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"o\"\u003e=\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003es\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"n\"\u003efield\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"k\"\u003eWHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan 
class=\"k\"\u003eNOT\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003eMATCHED\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eTHEN\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"k\"\u003eINSERT\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"p\"\u003e(\u003c/span\u003e\u003cspan class=\"k\"\u003ekey\u003c/span\u003e\u003cspan class=\"p\"\u003e,\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003efield\u003c/span\u003e\u003cspan class=\"p\"\u003e)\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"line\"\u003e\u003cspan class=\"cl\"\u003e\u003cspan class=\"w\"\u003e    \u003c/span\u003e\u003cspan class=\"k\"\u003eVALUES\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"p\"\u003e(\u003c/span\u003e\u003cspan class=\"n\"\u003es\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"k\"\u003ekey\u003c/span\u003e\u003cspan class=\"p\"\u003e,\u003c/span\u003e\u003cspan class=\"w\"\u003e \u003c/span\u003e\u003cspan class=\"n\"\u003es\u003c/span\u003e\u003cspan class=\"p\"\u003e.\u003c/span\u003e\u003cspan class=\"n\"\u003efield\u003c/span\u003e\u003cspan class=\"p\"\u003e);\u003c/span\u003e\u003cspan class=\"w\"\u003e\n\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003ch2 id=\"use-in-data-warehousing\" class=\"relative group\"\u003eUse in data warehousing \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#use-in-data-warehousing\" 
aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn an ETL context, MERGE is the core mechanism for loading dimension tables:\u003c/p\u003e","title":"MERGE"},{"content":"MVCC (Multi-Version Concurrency Control) is the concurrency model used by PostgreSQL to manage simultaneous data access. Every UPDATE creates a new row version and marks the old one as \u0026ldquo;dead\u0026rdquo;; every DELETE marks the row as no longer visible. Reads don\u0026rsquo;t block writes and vice versa.\nHow it works #Each transaction sees a consistent snapshot of the database at the moment it begins. Rows modified by other uncommitted transactions are invisible. This eliminates the need for exclusive locks on reads, enabling high concurrency — but generates \u0026ldquo;garbage\u0026rdquo; in the form of dead tuples that must be cleaned up by VACUUM.\nWhat it\u0026rsquo;s for #MVCC is PostgreSQL\u0026rsquo;s architectural trade-off: high concurrency without locks, at the price of having to manage cleanup of obsolete versions. It is a reasonable price — provided autovacuum is correctly configured to keep pace with the table modification rate.\nWhy it matters #If VACUUM cannot keep up with the rate of dead tuple generation, tables bloat, sequential scans slow down, and indexes become inefficient. The classic pattern: Monday the database is fine, Friday it\u0026rsquo;s a disaster.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mvcc/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eMVCC\u003c/strong\u003e (Multi-Version Concurrency Control) is the concurrency model used by PostgreSQL to manage simultaneous data access. Every UPDATE creates a new row version and marks the old one as \u0026ldquo;dead\u0026rdquo;; every DELETE marks the row as no longer visible. 
Reads don\u0026rsquo;t block writes and vice versa.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach transaction sees a consistent snapshot of the database at the moment it begins. Rows modified by other uncommitted transactions are invisible. This eliminates the need for exclusive locks on reads, enabling high concurrency — but generates \u0026ldquo;garbage\u0026rdquo; in the form of dead tuples that must be cleaned up by VACUUM.\u003c/p\u003e","title":"MVCC"},{"content":"mydumper is an open source logical backup tool for MySQL and MariaDB that implements true parallelism: not just across different tables, but also within the same table, splitting it into chunks based on the primary key.\nHow it works #mydumper connects to the MySQL server, acquires a consistent snapshot with FLUSH TABLES WITH READ LOCK (or --trx-consistency-only to avoid global locks on InnoDB), then distributes the work among multiple threads. Each large table is broken into chunks — by default based on primary key ranges — and each chunk is exported by a separate thread.\nThe output is not a single SQL file but a directory with one file per table (or per chunk), plus metadata, schema, and stored procedure files.\nRestoring with myloader #mydumper\u0026rsquo;s companion is myloader, which loads files in parallel while disabling foreign key checks and rebuilding indexes at the end. 
This approach makes the restore significantly faster compared to sequentially loading a single SQL file.\nWhen to use it #mydumper is the recommended choice for production databases over 10 GB where dump and restore speed is critical. On a 60 GB database with 8 threads, a dump that takes 3-4 hours with mysqldump completes in 20-25 minutes.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mydumper/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003emydumper\u003c/strong\u003e is an open source logical backup tool for MySQL and MariaDB that implements true parallelism: not just across different tables, but also within the same table, splitting it into chunks based on the primary key.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003emydumper connects to the MySQL server, acquires a consistent snapshot with \u003ccode\u003eFLUSH TABLES WITH READ LOCK\u003c/code\u003e (or \u003ccode\u003e--trx-consistency-only\u003c/code\u003e to avoid global locks on InnoDB), then distributes the work among multiple threads. Each large table is broken into chunks — by default based on primary key ranges — and each chunk is exported by a separate thread.\u003c/p\u003e","title":"mydumper"},{"content":"mysqlbinlog is the command-line utility shipped with MySQL for reading and decoding the contents of binary log files. It is the only tool capable of converting the binary format of binlogs into readable output or re-executable SQL statements.\nHow it works #mysqlbinlog reads binlog files and produces text-format output. 
It supports several filters:\nBy time range: --start-datetime and --stop-datetime to limit output to a time window By database: --database to filter events for a specific database By position: --start-position and --stop-position to select specific events With ROW format, the --verbose flag decodes row-level changes into commented pseudo-SQL format, otherwise the output is an unreadable binary blob.\nWhat it\u0026rsquo;s for #mysqlbinlog is used in two main scenarios:\nPoint-in-time recovery: extracting and replaying events from backup to the desired moment, piping output directly into the mysql client Replication debugging: analysing events to understand what was replicated, identifying problematic transactions or reconstructing the sequence of operations that caused an issue When to use it #mysqlbinlog is essential whenever you need to inspect what happened in the database after an incident, or when performing a point-in-time recovery. It requires access to binlog files on the server filesystem or the ability to connect to the server with --read-from-remote-server.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mysqlbinlog/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003emysqlbinlog\u003c/strong\u003e is the command-line utility shipped with MySQL for reading and decoding the contents of binary log files. 
It is the only tool capable of converting the binary format of binlogs into readable output or re-executable SQL statements.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003emysqlbinlog reads binlog files and produces text-format output. It supports several filters:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eBy time range\u003c/strong\u003e: \u003ccode\u003e--start-datetime\u003c/code\u003e and \u003ccode\u003e--stop-datetime\u003c/code\u003e to limit output to a time window\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eBy database\u003c/strong\u003e: \u003ccode\u003e--database\u003c/code\u003e to filter events for a specific database\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eBy position\u003c/strong\u003e: \u003ccode\u003e--start-position\u003c/code\u003e and \u003ccode\u003e--stop-position\u003c/code\u003e to select specific events\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eWith ROW format, the \u003ccode\u003e--verbose\u003c/code\u003e flag decodes row-level changes into commented pseudo-SQL format, otherwise the output is an unreadable binary blob.\u003c/p\u003e","title":"mysqlbinlog"},{"content":"mysqldump is the logical backup utility included by default in every MySQL and MariaDB installation. It produces an SQL file containing all the statements (CREATE TABLE, INSERT) needed to fully rebuild a database\u0026rsquo;s schema and data.\nHow it works #mysqldump connects to the MySQL server and reads tables one at a time, generating the corresponding SQL statements as output. 
The operation is strictly single-threaded: one table after another, one row after another. The output file can be compressed externally (gzip, zstd) but the tool itself offers no native compression.\nWith the --single-transaction option, the dump runs within a transaction at REPEATABLE READ isolation level, which guarantees a consistent snapshot on InnoDB tables without acquiring write locks.\nWhat it\u0026rsquo;s for #mysqldump is the standard tool for:\nLogical backup of small to medium databases Migrations between different MySQL versions Exporting individual tables or databases for transfer between environments Creating human-readable, inspectable dumps When it becomes a problem #On databases over 10-15 GB, the single-threaded dump becomes a bottleneck. A 60 GB database can require 3-4 hours for the dump and as many for the restore. The lack of parallelism is the structural limitation: there\u0026rsquo;s no way to speed up the process other than switching to tools like mydumper.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mysqldump/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003emysqldump\u003c/strong\u003e is the logical backup utility included by default in every MySQL and MariaDB installation. It produces an SQL file containing all the statements (CREATE TABLE, INSERT) needed to fully rebuild a database\u0026rsquo;s schema and data.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003emysqldump connects to the MySQL server and reads tables one at a time, generating the corresponding SQL statements as output. 
The operation is strictly single-threaded: one table after another, one row after another. The output file can be compressed externally (gzip, zstd) but the tool itself offers no native compression.\u003c/p\u003e","title":"mysqldump"},{"content":"mysqlpump is the logical backup utility introduced by Oracle in MySQL 5.7 as an evolution of mysqldump. The main difference is support for table-level parallelism and native output compression (zlib, lz4, zstd).\nHow it works #mysqlpump can dump multiple tables simultaneously using parallel threads, configurable via --default-parallelism. Compression is applied directly during the dump without needing external pipes to gzip. It also supports selective dumping of MySQL users and accounts.\nHowever, the parallelism operates only at the whole-table level: if a single table is much larger than the others, one thread drags on alone while the rest have already finished.\nThe consistency problem #With parallelism enabled, mysqlpump does not guarantee consistency across different tables — tables exported by different threads may reflect different points in time. This is a critical limitation for production backups on relational databases with foreign keys.\nCurrent status #Oracle declared mysqlpump deprecated in MySQL 8.0.34 and removed it entirely in MySQL 8.4. For those seeking parallelism in logical backup, mydumper is the recommended alternative.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mysqlpump/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003emysqlpump\u003c/strong\u003e is the logical backup utility introduced by Oracle in MySQL 5.7 as an evolution of mysqldump. 
The main difference is support for table-level parallelism and native output compression (zlib, lz4, zstd).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003emysqlpump can dump multiple tables simultaneously using parallel threads, configurable via \u003ccode\u003e--default-parallelism\u003c/code\u003e. Compression is applied directly during the dump without needing external pipes to gzip. It also supports selective dumping of MySQL users and accounts.\u003c/p\u003e","title":"mysqlpump"},{"content":"Nested loop is the simplest join strategy: for each row in the outer table, the database looks for matching rows in the inner table. It works like a double nested for loop — hence the name.\nHow it works #The optimizer picks one table as \u0026ldquo;outer\u0026rdquo; and one as \u0026ldquo;inner\u0026rdquo;. For each row in the outer table, it performs a lookup in the inner table on the join column. If the inner table has an index on the join column, each lookup is a direct B-tree access. Without an index, each lookup becomes a full sequential scan.\nWhen it\u0026rsquo;s the right choice #Nested loop is unbeatable when the outer table has few rows and the inner table has an index on the join column. 
A join on 100 outer rows with a B-tree index on the inner table is practically instantaneous: few iterations, direct access, minimal memory.\nIt\u0026rsquo;s also the preferred strategy for dimension lookups in data warehouses, where a filtered fact table (few rows) is joined with an indexed dimension table.\nWhat can go wrong #It becomes a disaster when the optimizer picks it on large datasets by mistake — typically because statistics underestimate the row count. A nested loop on 2 million outer rows means 2 million lookups in the inner table. Without an index, each lookup is a full scan.\nIn these cases a hash join or merge join would be orders of magnitude faster. The root cause is almost always a wrong cardinality estimate: stale statistics or a default_statistics_target that\u0026rsquo;s too low.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/nested-loop/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eNested loop\u003c/strong\u003e is the simplest join strategy: for each row in the outer table, the database looks for matching rows in the inner table. It works like a double nested \u003ccode\u003efor\u003c/code\u003e loop — hence the name.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe optimizer picks one table as \u0026ldquo;outer\u0026rdquo; and one as \u0026ldquo;inner\u0026rdquo;. For each row in the outer table, it performs a lookup in the inner table on the join column. If the inner table has an index on the join column, each lookup is a direct B-tree access. 
Without an index, each lookup becomes a full sequential scan.\u003c/p\u003e","title":"Nested Loop"},{"content":"NOLOGGING is an Oracle mode that disables redo log generation during bulk load operations. Operations complete much faster, but the data is not recoverable via redo in case of a crash before a backup is taken.\nHow it works #When a segment (table, index, partition) is in NOLOGGING mode, bulk operations like CTAS, INSERT /*+ APPEND */ and ALTER TABLE MOVE do not write redo log for data blocks. On a 380 GB copy, this eliminates the generation of the same amount of redo, preventing archivelog area saturation and reducing times from days to hours.\nWhat it\u0026rsquo;s for #NOLOGGING is essential for migration operations on large tables. Without NOLOGGING, a 380 GB CTAS would generate 380 GB of redo log, saturating the archivelog area for days. With NOLOGGING, the same operation completes in a few hours with minimal system impact.\nWhen to use it #Activate before the bulk operation and deactivate immediately after (ALTER TABLE ... LOGGING). An RMAN backup must be run immediately afterwards, because NOLOGGING changes cannot be recovered by applying redo after a restore. Never leave NOLOGGING permanently active on production tables.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/nologging/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eNOLOGGING\u003c/strong\u003e is an Oracle mode that disables redo log generation during bulk load operations. 
Operations complete much faster, but the data is not recoverable via redo in case of a crash before a backup is taken.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a segment (table, index, partition) is in NOLOGGING mode, bulk operations like CTAS, \u003ccode\u003eINSERT /*+ APPEND */\u003c/code\u003e and \u003ccode\u003eALTER TABLE MOVE\u003c/code\u003e do not write redo log for data blocks. On a 380 GB copy, this eliminates the generation of the same amount of redo, preventing archivelog area saturation and reducing times from days to hours.\u003c/p\u003e","title":"NOLOGGING"},{"content":"An Object Privilege in Oracle is an authorization that allows performing operations on a specific database object: a table, view, sequence, or PL/SQL procedure. Typical examples include SELECT ON schema.table, INSERT ON schema.table, and EXECUTE ON schema.procedure.\nHow it works #Object privileges are granted with GRANT specifying the operation and the target object: GRANT SELECT ON app_owner.customers TO srv_report. They can be assigned to individual users or roles. Unlike system privileges, they operate on a single object and do not confer global powers over the database.\nWhat it\u0026rsquo;s for #Object privileges are the primary tool for implementing the principle of least privilege in Oracle. They allow building granular access models: a reporting user gets only SELECT, an application user gets SELECT + INSERT + UPDATE on operational tables, and so on. 
Combined with custom roles, they create clean and maintainable security architectures.\nWhy it matters #The difference between GRANT SELECT ON app_owner.customers and GRANT DBA is the difference between giving the key to one room and giving the keys to the entire building. In environments with hundreds of tables, object privileges are typically managed through PL/SQL blocks that automatically generate grants for all tables in a schema.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/object-privilege/","section":"Glossary","summary":"\u003cp\u003eAn \u003cstrong\u003eObject Privilege\u003c/strong\u003e in Oracle is an authorization that allows performing operations on a specific database object: a table, view, sequence, or PL/SQL procedure. Typical examples include \u003ccode\u003eSELECT ON schema.table\u003c/code\u003e, \u003ccode\u003eINSERT ON schema.table\u003c/code\u003e, and \u003ccode\u003eEXECUTE ON schema.procedure\u003c/code\u003e.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eObject privileges are granted with \u003ccode\u003eGRANT\u003c/code\u003e specifying the operation and the target object: \u003ccode\u003eGRANT SELECT ON app_owner.customers TO srv_report\u003c/code\u003e. They can be assigned to individual users or roles. Unlike system privileges, they operate on a single object and do not confer global powers over the database.\u003c/p\u003e","title":"Object Privilege"},{"content":"OCI (Oracle Cloud Infrastructure) is Oracle\u0026rsquo;s cloud platform, launched in its second generation in 2018. 
Unlike other cloud providers, OCI is natively designed for Oracle Database workloads and offers significant licensing and performance advantages.\nWhy OCI for Oracle Database #The main advantage is licensing. On OCI, Oracle recognizes its own OCPUs (Oracle CPUs) with a 1:1 ratio for license counting purposes. On other cloud providers like AWS or Azure, the vCPU-to-license ratio is less favorable and the audit risk is real.\nThe BYOL (Bring Your Own License) program allows reusing existing on-premises licenses on OCI at no additional cost — a decisive factor for organizations that have already invested in Enterprise Edition licenses.\nKey services for DBAs # Bare Metal DB Systems: dedicated physical servers with pre-installed Oracle Database VM DB Systems: virtual instances with flexible configuration (Flex shapes) Exadata Cloud Service: fully managed Exadata in the cloud Autonomous Database: fully managed database with automatic tuning Networking and connectivity #OCI offers FastConnect for dedicated high-bandwidth connections between on-premises data centers and cloud regions, plus site-to-site VPN for scenarios with lower bandwidth requirements. Link latency and bandwidth are critical factors in cross-site Data Guard migrations.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/oci/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eOCI\u003c/strong\u003e (Oracle Cloud Infrastructure) is Oracle\u0026rsquo;s cloud platform, launched in its second generation in 2018. 
Unlike other cloud providers, OCI is natively designed for Oracle Database workloads and offers significant licensing and performance advantages.\u003c/p\u003e\n\u003ch2 id=\"why-oci-for-oracle-database\" class=\"relative group\"\u003eWhy OCI for Oracle Database \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#why-oci-for-oracle-database\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe main advantage is licensing. On OCI, Oracle recognizes its own OCPUs (Oracle CPUs) with a 1:1 ratio for license counting purposes. On other cloud providers like AWS or Azure, the vCPU-to-license ratio is less favorable and the audit risk is real.\u003c/p\u003e","title":"OCI"},{"content":"OLAP (Online Analytical Processing) refers to a data processing approach oriented to multidimensional analysis: aggregations, drill-down, time comparisons, slice-and-dice on large volumes of historical data.\nOLAP vs OLTP # Feature OLAP OLTP Purpose Analysis and reporting Operational transactions Data model Star schema, denormalized 3NF, normalized Typical query Aggregations over millions of rows Read/write of a few rows Users Analysts, management Applications, operators Updates Batch (periodic ETL) Real-time OLAP operations #The fundamental OLAP analysis operations are:\nDrill-down: from aggregated level to detail Drill-up (roll-up): from detail to aggregated level Slice: select a \u0026ldquo;slice\u0026rdquo; of data by fixing one dimension (e.g. year 2025 only) Dice: select a sub-cube by specifying multiple dimensions Pivot: rotate analysis dimensions (rows ↔ columns) Implementations # ROLAP (Relational OLAP): data stays in relational tables, aggregations are computed with SQL queries. 
This is the approach used in data warehouses with star schemas MOLAP (Multidimensional OLAP): data is pre-aggregated in multidimensional structures (cubes). Faster for queries but requires more space and build time HOLAP (Hybrid): combination of both approaches ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/olap/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eOLAP\u003c/strong\u003e (Online Analytical Processing) refers to a data processing approach oriented to multidimensional analysis: aggregations, drill-down, time comparisons, slice-and-dice on large volumes of historical data.\u003c/p\u003e\n\u003ch2 id=\"olap-vs-oltp\" class=\"relative group\"\u003eOLAP vs OLTP \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#olap-vs-oltp\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003ctable\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003eFeature\u003c/th\u003e\n          \u003cth\u003eOLAP\u003c/th\u003e\n          \u003cth\u003eOLTP\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003ePurpose\u003c/td\u003e\n          \u003ctd\u003eAnalysis and reporting\u003c/td\u003e\n          \u003ctd\u003eOperational transactions\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eData model\u003c/td\u003e\n          \u003ctd\u003eStar schema, denormalized\u003c/td\u003e\n          \u003ctd\u003e3NF, normalized\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eTypical query\u003c/td\u003e\n          \u003ctd\u003eAggregations over millions of rows\u003c/td\u003e\n          \u003ctd\u003eRead/write of a few rows\u003c/td\u003e\n      
\u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eUsers\u003c/td\u003e\n          \u003ctd\u003eAnalysts, management\u003c/td\u003e\n          \u003ctd\u003eApplications, operators\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eUpdates\u003c/td\u003e\n          \u003ctd\u003eBatch (periodic ETL)\u003c/td\u003e\n          \u003ctd\u003eReal-time\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2 id=\"olap-operations\" class=\"relative group\"\u003eOLAP operations \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#olap-operations\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe fundamental OLAP analysis operations are:\u003c/p\u003e","title":"OLAP"},{"content":"Outsourcing is the practice of entrusting the development, maintenance or management of IT systems to suppliers external to the company. It can involve complete projects (custom software development) or ongoing services (infrastructure management, application support).\nHow it works #The client company defines requirements and signs a contract with an external supplier who commits to delivering the project. The most common contract models are: fixed-price (set price for defined result), time and materials (billed man-days), or hybrid. The supplier provides a team of consultants who work on the project, often with periodic staff rotation.\nWhat it\u0026rsquo;s for #Understanding outsourcing risks is fundamental for deciding what to externalise and what to keep in-house. 
The main risks are: vendor lock-in, know-how loss, scope creep, consultant turnover and incentive misalignment (the supplier earns by time, not by result).\nWhen to use it #Outsourcing can work for commoditised or well-defined activities. It becomes risky for strategic, custom and long-term projects where domain-specific know-how is critical. The often more effective alternative is a small, competent internal team, possibly supported by consultants for specific specialist skills.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/outsourcing/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eOutsourcing\u003c/strong\u003e is the practice of entrusting the development, maintenance or management of IT systems to suppliers external to the company. It can involve complete projects (custom software development) or ongoing services (infrastructure management, application support).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe client company defines requirements and signs a contract with an external supplier who commits to delivering the project. The most common contract models are: fixed-price (set price for defined result), time and materials (billed man-days), or hybrid. The supplier provides a team of consultants who work on the project, often with periodic staff rotation.\u003c/p\u003e","title":"Outsourcing"},{"content":"The Parking Lot is a visible list — on a whiteboard, shared document, or chat — where the facilitator notes topics that emerge during a meeting but cannot be discussed within the available time. 
Topics are \u0026ldquo;parked\u0026rdquo; and addressed after the meeting with only the people involved.\nHow it works #When someone raises a complex issue during a standup, the facilitator says: \u0026ldquo;I\u0026rsquo;ll note it in the parking lot, we\u0026rsquo;ll discuss it afterwards.\u0026rdquo; The topic is not ignored — it is simply moved to the right context, where it can be addressed without wasting time for those not involved.\nWhat it\u0026rsquo;s for #It is the most underrated tool in standup and meeting management in general. It allows saying \u0026ldquo;we\u0026rsquo;ll discuss it later\u0026rdquo; without the topic being forgotten. It solves the \u0026ldquo;thread killer\u0026rdquo; problem — those discussions between two people that block the entire meeting.\nWhy it matters #Without a parking lot, the facilitator has two bad options: let the discussion expand (and the standup overruns), or cut the topic (and someone feels ignored). The parking lot offers a third way: acknowledge the topic\u0026rsquo;s importance and guarantee it will be addressed, but at the right time.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/parking-lot/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eParking Lot\u003c/strong\u003e is a visible list — on a whiteboard, shared document, or chat — where the facilitator notes topics that emerge during a meeting but cannot be discussed within the available time. 
Topics are \u0026ldquo;parked\u0026rdquo; and addressed after the meeting with only the people involved.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen someone raises a complex issue during a standup, the facilitator says: \u0026ldquo;I\u0026rsquo;ll note it in the parking lot, we\u0026rsquo;ll discuss it afterwards.\u0026rdquo; The topic is not ignored — it is simply moved to the right context, where it can be addressed without wasting time for those not involved.\u003c/p\u003e","title":"Parking Lot"},{"content":"Partita IVA is the tax identification code assigned to self-employed workers and businesses in Italy for VAT-liable operations. In IT consulting, \u0026ldquo;working with partita IVA\u0026rdquo; means operating as a freelancer, invoicing services directly to the client.\nHow it works #The freelance consultant issues an invoice at the end of the work period. Payment follows contractual terms — which in Italy are typically 60-90-120 days end of month. Meanwhile, the consultant bears all expenses (social security contributions, taxes, rent, utilities) from personal funds.\nWhat it\u0026rsquo;s for #It is the standard regime for IT consulting in Italy. It offers flexibility and autonomy, but exposes the professional to credit risk: if the client doesn\u0026rsquo;t pay or pays late, the consultant has no protections comparable to an employee\u0026rsquo;s. There is no unemployment insurance, no severance pay, no bonus months.\nWhat can go wrong #With 90-day payment terms and a single client, the freelance consultant is the weakest link in the chain. 
They have no bargaining power, no legal department, and refusing conditions means having no projects. Client diversification is the primary defensive strategy.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/partita-iva/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003ePartita IVA\u003c/strong\u003e is the tax identification code assigned to self-employed workers and businesses in Italy for VAT-liable operations. In IT consulting, \u0026ldquo;working with partita IVA\u0026rdquo; means operating as a freelancer, invoicing services directly to the client.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe freelance consultant issues an invoice at the end of the work period. Payment follows contractual terms — which in Italy are typically 60-90-120 days end of month. Meanwhile, the consultant bears all expenses (social security contributions, taxes, rent, utilities) from personal funds.\u003c/p\u003e","title":"Partita IVA"},{"content":"Partition Pruning is the mechanism by which Oracle, during query execution on a partitioned table, automatically identifies and excludes partitions that cannot contain data relevant to the query predicate.\nHow it works #When a query includes a predicate on the partition column (e.g. WHERE data_movimento BETWEEN ...), Oracle consults the partition metadata and determines which partitions contain data in the requested range. Only those partitions are read. 
In the execution plan it appears as PARTITION RANGE SINGLE or PARTITION RANGE ITERATOR.\nWhat it\u0026rsquo;s for #On a 380 GB table with monthly partitions, a query for a single month reads only ~4 GB instead of the entire table. Pruning transforms a nightmare full table scan into a manageable full partition scan, reducing I/O by 99%.\nWhen to use it #Pruning is automatic, but only works with direct predicates on the partition column. Applying functions to the column (TRUNC(date), TO_CHAR(date)) disables pruning and forces Oracle to read all partitions. Always verify with EXPLAIN PLAN that pruning is active.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/partition-pruning/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003ePartition Pruning\u003c/strong\u003e is the mechanism by which Oracle, during query execution on a partitioned table, automatically identifies and excludes partitions that cannot contain data relevant to the query predicate.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a query includes a predicate on the partition column (e.g. \u003ccode\u003eWHERE data_movimento BETWEEN ...\u003c/code\u003e), Oracle consults the partition metadata and determines which partitions contain data in the requested range. Only those partitions are read. 
In the execution plan it appears as \u003ccode\u003ePARTITION RANGE SINGLE\u003c/code\u003e or \u003ccode\u003ePARTITION RANGE ITERATOR\u003c/code\u003e.\u003c/p\u003e","title":"Partition Pruning"},{"content":"Pedal Assist is an electric propulsion system mounted on a bicycle that amplifies the cyclist\u0026rsquo;s pedaling force through an electric motor. The motor activates only when pedaling and cuts off above 25 km/h (European limit).\nHow it works #A sensor detects pedaling force and cadence and activates the electric motor proportionally. The harder you pedal, the more the motor helps. The result is that climbs like the Celio hill in Rome become a gentle slope, and you arrive at your destination without breaking a sweat — a crucial detail for someone who needs to show up at the office.\nWhat it\u0026rsquo;s for #It eliminates the two main objections to cycling as urban transport: hills and sweat. With pedal assist, an 8 km city route is covered in 18 minutes regardless of elevation, arriving fresh and ready to work. Typical range is 40-80 km, more than enough for a week of commuting.\nWhy it matters #With an e-bike, the \u0026ldquo;Rome has seven hills\u0026rdquo; argument falls apart. The seven hills no longer exist with a motor assisting uphill. This makes cycling competitive with driving even in hilly cities, removing the last excuse for staying in traffic.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pedalata-assistita/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003ePedal Assist\u003c/strong\u003e is an electric propulsion system mounted on a bicycle that amplifies the cyclist\u0026rsquo;s pedaling force through an electric motor. 
The motor activates only when pedaling and cuts off above 25 km/h (European limit).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA sensor detects pedaling force and cadence and activates the electric motor proportionally. The harder you pedal, the more the motor helps. The result is that climbs like the Celio hill in Rome become a gentle slope, and you arrive at your destination without breaking a sweat — a crucial detail for someone who needs to show up at the office.\u003c/p\u003e","title":"Pedal Assist"},{"content":"pg_stat_statements is a PostgreSQL extension — included in the official distribution but not active by default — that tracks execution statistics for all SQL queries that pass through the server. Queries are normalized (literal values replaced with parameters) to group executions of the same pattern.\nHow it works #The extension requires loading as a shared library at server startup via the shared_preload_libraries parameter. Once active, it records for each query: execution count, total and average time, rows returned, blocks read from disk and from cache. The pg_stat_statements.max parameter controls how many distinct queries are tracked (default 5000).\nWhat it\u0026rsquo;s for #It\u0026rsquo;s the primary tool for identifying the most expensive queries on a PostgreSQL server. Sorting by total_exec_time immediately gives the ranking of queries consuming the most resources. 
Combined with EXPLAIN ANALYZE, it enables a complete diagnostic workflow: pg_stat_statements identifies the problem, EXPLAIN explains the cause.\nWhen to use it #It should be active on any production PostgreSQL installation. The overhead is negligible (1-2% CPU). Without pg_stat_statements, any performance tuning activity is based on guesswork rather than data.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pg-stat-statements/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003epg_stat_statements\u003c/strong\u003e is a PostgreSQL extension — included in the official distribution but not active by default — that tracks execution statistics for all SQL queries that pass through the server. Queries are normalized (literal values replaced with parameters) to group executions of the same pattern.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe extension requires loading as a shared library at server startup via the \u003ccode\u003eshared_preload_libraries\u003c/code\u003e parameter. Once active, it records for each query: execution count, total and average time, rows returned, blocks read from disk and from cache. The \u003ccode\u003epg_stat_statements.max\u003c/code\u003e parameter controls how many distinct queries are tracked (default 5000).\u003c/p\u003e","title":"pg_stat_statements"},{"content":"pg_trgm is a PostgreSQL extension that implements trigram-based searching — sequences of three consecutive characters extracted from text. 
It enables the use of GIN and GiST indexes to accelerate LIKE '%value%' and ILIKE searches, which would otherwise require sequential scans.\nHow it works #The extension decomposes each string into trigrams: for example, \u0026ldquo;hello\u0026rdquo; becomes {\u0026quot;  h\u0026quot;, \u0026quot; he\u0026quot;, \u0026quot;hel\u0026quot;, \u0026quot;ell\u0026quot;, \u0026quot;llo\u0026quot;, \u0026quot;lo \u0026quot;} (pg_trgm pads the string with two leading and one trailing blank). A GIN index with operator class gin_trgm_ops indexes these trigrams. When executing a LIKE '%ell%', PostgreSQL searches for matching trigrams in the index instead of scanning the entire table.\nWhat it\u0026rsquo;s for #pg_trgm solves one of the most common problems in PostgreSQL: \u0026ldquo;contains\u0026rdquo; searches on large text columns. Without pg_trgm, a LIKE '%value%' on a table with millions of rows requires a full scan. With pg_trgm and a GIN index, the same search uses the index and responds in milliseconds.\nWhen to use it #Activate with CREATE EXTENSION IF NOT EXISTS pg_trgm and create the index with USING gin (column gin_trgm_ops). It is ideal on tables with low churn (few UPDATEs/DELETEs). Index creation should use CONCURRENTLY in production to avoid locks.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pg-trgm/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003epg_trgm\u003c/strong\u003e is a PostgreSQL extension that implements trigram-based searching — sequences of three consecutive characters extracted from text. 
It enables the use of GIN and GiST indexes to accelerate \u003ccode\u003eLIKE '%value%'\u003c/code\u003e and \u003ccode\u003eILIKE\u003c/code\u003e searches, which would otherwise require sequential scans.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe extension decomposes each string into trigrams: for example, \u0026ldquo;hello\u0026rdquo; becomes {\u0026quot;  h\u0026quot;, \u0026quot; he\u0026quot;, \u0026quot;hel\u0026quot;, \u0026quot;ell\u0026quot;, \u0026quot;llo\u0026quot;, \u0026quot;lo \u0026quot;}. A GIN index with operator class \u003ccode\u003egin_trgm_ops\u003c/code\u003e indexes these trigrams. When executing a \u003ccode\u003eLIKE '%ell%'\u003c/code\u003e, PostgreSQL searches for matching trigrams in the index instead of scanning the entire table.\u003c/p\u003e","title":"pg_trgm"},{"content":"PITR (Point-in-Time Recovery) is a restore technique that allows bringing a database back to any moment in time, not just the moment of the backup. 
It relies on combining a full backup with transaction logs (binary logs in MySQL, WAL in PostgreSQL, redo logs in Oracle).\nHow it works #The process has two phases:\nBackup restore: the database is restored to the last available backup Log replay: transaction logs are replayed from the backup moment up to the desired point in time, excluding the event that caused the problem In MySQL, the mysqlbinlog tool extracts events from binary logs and replays them on the restored database.\nWhat it\u0026rsquo;s for #PITR is essential when a human error occurs (DROP TABLE, DELETE without WHERE, wrong mass UPDATE) and the database needs to be restored to the state immediately before the error, without losing the hours of work between the last backup and the incident.\nWhen to use it #PITR requires binary logging to be active and binlog files not to have been deleted. Binlog retention should cover at least twice the interval between two consecutive backups to guarantee complete PITR coverage.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pitr/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003ePITR\u003c/strong\u003e (Point-in-Time Recovery) is a restore technique that allows bringing a database back to any moment in time, not just the moment of the backup. 
It relies on combining a full backup with transaction logs (binary logs in MySQL, WAL in PostgreSQL, redo logs in Oracle).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe process has two phases:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003eBackup restore\u003c/strong\u003e: the database is restored to the last available backup\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLog replay\u003c/strong\u003e: transaction logs are replayed from the backup moment up to the desired point in time, excluding the event that caused the problem\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eIn MySQL, the \u003ccode\u003emysqlbinlog\u003c/code\u003e tool extracts events from binary logs and replays them on the restored database.\u003c/p\u003e","title":"PITR"},{"content":"Presenteeism is the organizational culture that measures work value based on an employee\u0026rsquo;s physical presence in the office, regardless of the quality and quantity of results produced. It is the assumption that \u0026ldquo;if I can see you at your desk, you\u0026rsquo;re working.\u0026rdquo;\nHow it works #In a presenteeist organization, being in the office from 9 to 6 matters more than closing tasks. Arriving late is a problem even if you\u0026rsquo;ve produced more than everyone else. Working from home is suspicious even if results are excellent. Control is based on sight, not metrics.\nWhy it matters #Presenteeism is the main obstacle to smart working adoption in IT consulting. 
An IT consultant does not work on an assembly line — they need concentration, quiet, and digital tools, not a desk in a noisy open space. Confusing presence with productivity is a cultural legacy, not an operational necessity.\nWhat can go wrong #Companies that don\u0026rsquo;t overcome presenteeism pay an invisible cost: unproductive commuting hours, employees arriving stressed and drained, talent leaving for more flexible companies. Presenteeism doesn\u0026rsquo;t protect productivity — it destroys it.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/presenteismo/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003ePresenteeism\u003c/strong\u003e is the organizational culture that measures work value based on an employee\u0026rsquo;s physical presence in the office, regardless of the quality and quantity of results produced. It is the assumption that \u0026ldquo;if I can see you at your desk, you\u0026rsquo;re working.\u0026rdquo;\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a presenteeist organization, being in the office from 9 to 6 matters more than closing tasks. Arriving late is a problem even if you\u0026rsquo;ve produced more than everyone else. Working from home is suspicious even if results are excellent. Control is based on sight, not metrics.\u003c/p\u003e","title":"Presenteeism"},{"content":"A Pull Request (PR) is a formal request to incorporate changes from a development branch into the repository\u0026rsquo;s main branch. 
It is the central collaboration mechanism on GitHub and similar platforms.\nHow it works #The developer works on a dedicated branch (e.g. fix/issue-234-calculation-error), completes the changes, and opens a PR. The PR shows the code diff, allows colleagues to comment line by line, request changes or approve. Only after approval is the code merged into the main branch. This ensures that \u0026ldquo;good\u0026rdquo; code stays good.\nWhat it\u0026rsquo;s for #The PR transforms development from an individual activity into a team process. It prevents accidental overwrites, catches bugs before they reach production, and creates a complete history of who did what, when and why. In chaotic projects, it\u0026rsquo;s the difference between control and disorder.\nWhen to use it #On every code change, without exceptions. Even small fixes go through a PR, because the value is not just in the review but in traceability. On GitLab platforms the same functionality is called Merge Request.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/pull-request/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003ePull Request\u003c/strong\u003e (PR) is a formal request to incorporate changes from a development branch into the repository\u0026rsquo;s main branch. It is the central collaboration mechanism on GitHub and similar platforms.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe developer works on a dedicated branch (e.g. \u003ccode\u003efix/issue-234-calculation-error\u003c/code\u003e), completes the changes, and opens a PR. 
The PR shows the code diff, allows colleagues to comment line by line, request changes or approve. Only after approval is the code merged into the main branch. This ensures that \u0026ldquo;good\u0026rdquo; code stays good.\u003c/p\u003e","title":"Pull Request"},{"content":"Quorum is the minimum number of nodes that must agree for the cluster to be considered operational. In a 3-node cluster, the quorum is 2 (the majority). If one node disconnects, the other two recognise they are the majority and continue operating normally.\nHow it works #Galera Cluster uses a group communication protocol that continuously checks how many nodes are reachable. The calculation is simple: quorum = (total nodes / 2) + 1. With 3 nodes the quorum is 2, with 5 nodes it\u0026rsquo;s 3. Nodes that lose quorum transition to Non-Primary state and refuse writes to avoid inconsistencies.\nWhat it\u0026rsquo;s for #Quorum prevents split-brain: the situation where two parts of the cluster operate independently, accepting different writes on the same data. Without quorum, a network partition could lead to divergent data impossible to reconcile automatically.\nWhen to use it #Quorum is automatically active in any Galera cluster. This is why three nodes is the minimum in production: with two nodes, losing one leaves the survivor without quorum and therefore blocked.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/quorum/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eQuorum\u003c/strong\u003e is the minimum number of nodes that must agree for the cluster to be considered operational. In a 3-node cluster, the quorum is 2 (the majority). 
If one node disconnects, the other two recognise they are the majority and continue operating normally.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eGalera Cluster uses a group communication protocol that continuously checks how many nodes are reachable. The calculation is simple: quorum = (total nodes / 2) + 1. With 3 nodes the quorum is 2, with 5 nodes it\u0026rsquo;s 3. Nodes that lose quorum transition to Non-Primary state and refuse writes to avoid inconsistencies.\u003c/p\u003e","title":"Quorum"},{"content":"RAC (Real Application Clusters) is Oracle\u0026rsquo;s technology that allows multiple database instances to simultaneously access the same shared storage. If a node fails, the others continue serving requests without interruption — failover is transparent to applications.\nHow it works #A RAC cluster consists of two or more servers (nodes) connected via a high-speed private network (interconnect) and shared storage (typically ASM — Automatic Storage Management). Each node runs its own Oracle instance, but all access the same datafiles.\nThe Cache Fusion protocol manages data coherence across nodes: when a block modified by one node is needed by another, it\u0026rsquo;s transferred directly via the interconnect without going through disk.\nHigh availability #If a node goes down, active sessions are automatically transferred to the remaining nodes via TAF (Transparent Application Failover) or Application Continuity. 
Failover time depends on configuration but is typically in the order of seconds.\nLicensing implications #RAC is an Enterprise Edition option with significant licensing costs. During cloud migration, RAC license counting is one of the most sensitive aspects: on OCI with BYOL, on-premises licenses are reused; on other cloud providers, the cost can multiply.\nWhen it\u0026rsquo;s really needed #RAC is justified when automatic failover high availability and horizontal scalability are required. For environments with few users or standard uptime requirements, a single node with Data Guard is often a simpler and less expensive solution.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/rac/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRAC\u003c/strong\u003e (Real Application Clusters) is Oracle\u0026rsquo;s technology that allows multiple database instances to simultaneously access the same shared storage. If a node fails, the others continue serving requests without interruption — failover is transparent to applications.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA RAC cluster consists of two or more servers (nodes) connected via a high-speed private network (interconnect) and shared storage (typically ASM — Automatic Storage Management). Each node runs its own Oracle instance, but all access the same datafiles.\u003c/p\u003e","title":"RAC"},{"content":"A ragged hierarchy (also called unbalanced hierarchy) is a hierarchical structure where not all branches reach the same depth. 
Some intermediate levels are missing for certain entities.\nConcrete example #In a three-level hierarchy Top Group → Group → Client:\nSome clients have all three levels (complete hierarchy) Some clients have a Group but no Top Group Some clients have neither Group nor Top Group (direct clients) The result is a structure with \u0026ldquo;holes\u0026rdquo; that causes problems in aggregation reports: NULL rows, split totals, incomplete drill-downs.\nWhy it\u0026rsquo;s a problem in the DWH #BI tools and SQL queries expect complete hierarchies to work correctly. A GROUP BY on a column with NULLs produces unexpected results: NULL rows are grouped separately, totals don\u0026rsquo;t add up, and the same group can appear on multiple rows.\nHow to solve it #The standard technique is self-parenting: an entity without a parent becomes its own parent. This balances the hierarchy upstream, in the ETL, eliminating NULLs from the dimension table. Additional flags (is_direct_client, is_standalone_group) allow distinguishing artificially balanced entities from those with a natural hierarchy.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/ragged-hierarchy/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eragged hierarchy\u003c/strong\u003e (also called unbalanced hierarchy) is a hierarchical structure where not all branches reach the same depth. 
Some intermediate levels are missing for certain entities.\u003c/p\u003e\n\u003ch2 id=\"concrete-example\" class=\"relative group\"\u003eConcrete example \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#concrete-example\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a three-level hierarchy Top Group → Group → Client:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eSome clients have all three levels (complete hierarchy)\u003c/li\u003e\n\u003cli\u003eSome clients have a Group but no Top Group\u003c/li\u003e\n\u003cli\u003eSome clients have neither Group nor Top Group (direct clients)\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe result is a structure with \u0026ldquo;holes\u0026rdquo; that causes problems in aggregation reports: NULL rows, split totals, incomplete drill-downs.\u003c/p\u003e","title":"Ragged hierarchy"},{"content":"Range Partitioning is a table partitioning strategy where rows are distributed across different partitions based on the value of a column relative to predefined ranges. The partition column is almost always a date in data warehouses.\nHow it works #Each partition is defined with a VALUES LESS THAN clause that sets the upper bound of the range. Oracle automatically assigns each row to the correct partition based on the partition column value. If a row has data_vendita = '2025-03-15', it gets inserted into the partition whose range includes that date.\nWhen to use it #Range partitioning is the natural choice when data has a dominant time dimension — fact tables in data warehouses, log tables, transaction tables. 
The partition granularity (daily, monthly, quarterly) depends on insert volume and query patterns: partitions that are too small create management overhead, too large and they reduce partition pruning effectiveness.\nOperational advantages #Beyond query performance, range partitioning enables data lifecycle management operations that are impossible on monolithic tables: instant partition drops (no DELETE needed), selective compression of historical partitions, movement to different storage tiers (ILM — Information Lifecycle Management), and exchange partition for zero-impact bulk loads.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/range-partitioning/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRange Partitioning\u003c/strong\u003e is a table partitioning strategy where rows are distributed across different partitions based on the value of a column relative to predefined ranges. The partition column is almost always a date in data warehouses.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach partition is defined with a \u003ccode\u003eVALUES LESS THAN\u003c/code\u003e clause that sets the upper bound of the range. Oracle automatically assigns each row to the correct partition based on the partition column value. 
If a row has \u003ccode\u003edata_vendita = '2025-03-15'\u003c/code\u003e, it gets inserted into the partition whose range includes that date.\u003c/p\u003e","title":"Range Partitioning"},{"content":"Redo Log is the mechanism by which Oracle records every data modification (INSERT, UPDATE, DELETE, DDL) before it is permanently written to the datafiles. It is the fundamental guarantee of transaction durability.\nHow it works #Oracle writes changes to the online redo logs sequentially and continuously. Redo logs are organized in circular groups: when one group fills up, Oracle switches to the next. When all groups have been used, Oracle returns to the first (log switch).\nOnline vs Archived # Online redo log: the active files where Oracle writes in real time. They are circular and get overwritten Archived redo log: copies of online redo logs saved before overwriting. Required for point-in-time recovery and for Data Guard The database\u0026rsquo;s ARCHIVELOG mode enables automatic creation of archived logs. Without it, redo logs are overwritten and recovery is limited to the last full backup.\nWhy they matter #Redo logs are the heart of Oracle recovery and replication. Without redo:\nInstance recovery after a crash is not possible Point-in-time recovery (media recovery) is not possible Data Guard cannot function (replication relies entirely on redo) Flashback database is not possible ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/redo-log/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRedo Log\u003c/strong\u003e is the mechanism by which Oracle records every data modification (INSERT, UPDATE, DELETE, DDL) before it is permanently written to the datafiles. 
It is the fundamental guarantee of transaction durability.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eOracle writes changes to the online redo logs sequentially and continuously. Redo logs are organized in circular groups: when one group fills up, Oracle switches to the next. When all groups have been used, Oracle returns to the first (log switch).\u003c/p\u003e","title":"Redo Log"},{"content":"The relay log is an intermediate log file present on the slave in a MySQL replication architecture. It contains events received from the master\u0026rsquo;s binary log, waiting to be executed locally by the slave\u0026rsquo;s SQL thread.\nHow it works #MySQL replication flows through the relay log in three phases:\nThe slave\u0026rsquo;s I/O thread connects to the master and reads the binary logs Received events are written to the local relay log The slave\u0026rsquo;s SQL thread reads events from the relay log and executes them on the local database This two-thread architecture decouples data reception from data application: the slave can continue receiving events from the master even if local execution is temporarily slower.\nWhat it\u0026rsquo;s for #The relay log is the mechanism that ensures replication consistency. It acts as a buffer between the master and the local application of events, allowing the slave to handle speed differences without losing data.\nWhen to use it #The relay log is created automatically when MySQL replication is configured. 
It doesn\u0026rsquo;t require direct manual management, but its state (current position, potential lag) is visible through SHOW REPLICA STATUS and is essential for diagnosing replication lag issues.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/relay-log/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003erelay log\u003c/strong\u003e is an intermediate log file present on the slave in a MySQL replication architecture. It contains events received from the master\u0026rsquo;s binary log, waiting to be executed locally by the slave\u0026rsquo;s SQL thread.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eMySQL replication flows through the relay log in three phases:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eThe slave\u0026rsquo;s \u003cstrong\u003eI/O thread\u003c/strong\u003e connects to the master and reads the binary logs\u003c/li\u003e\n\u003cli\u003eReceived events are written to the local relay log\u003c/li\u003e\n\u003cli\u003eThe slave\u0026rsquo;s \u003cstrong\u003eSQL thread\u003c/strong\u003e reads events from the relay log and executes them on the local database\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThis two-thread architecture decouples data reception from data application: the slave can continue receiving events from the master even if local execution is temporarily slower.\u003c/p\u003e","title":"Relay log"},{"content":"REVOKE is the SQL command that removes privileges or roles previously assigned with GRANT. 
It is the indispensable complement to GRANT and the primary tool for restricting permissions when a security model is restructured.\nHow it works #The syntax follows the same pattern as GRANT: REVOKE SELECT ON schema.table FROM user or REVOKE role FROM user. In Oracle, revoking a role like DBA removes in one stroke all the system privileges included in that role. Before executing a critical REVOKE, it is essential to verify that the user retains the privileges necessary for their functions.\nWhen to use it #The most common case is security model restructuring: removing excessive roles (like DBA from application users) and replacing them with calibrated custom roles. It is also used when a user changes function, when a service is decommissioned, or when an audit reveals privileges granted in excess.\nWhat can go wrong #A poorly planned REVOKE can break production applications. If an application connects with a user that loses the CREATE SESSION privilege, it stops working instantly. This is why revoking critical privileges should always be preceded by a dependency analysis and a gradual rollout plan.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/revoke/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eREVOKE\u003c/strong\u003e is the SQL command that removes privileges or roles previously assigned with \u003ccode\u003eGRANT\u003c/code\u003e. 
It is the indispensable complement to GRANT and the primary tool for restricting permissions when a security model is restructured.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe syntax follows the same pattern as GRANT: \u003ccode\u003eREVOKE SELECT ON schema.table FROM user\u003c/code\u003e or \u003ccode\u003eREVOKE role FROM user\u003c/code\u003e. In Oracle, revoking a role like \u003ccode\u003eDBA\u003c/code\u003e removes in one stroke all the system privileges included in that role. Before executing a critical REVOKE, it is essential to verify that the user retains the privileges necessary for their functions.\u003c/p\u003e","title":"REVOKE"},{"content":"RMAN (Recovery Manager) is Oracle\u0026rsquo;s native tool for database backup, restore and recovery. It is a command-line utility that manages all data protection operations in an integrated way with the database.\nWhat it does # Backup: full, incremental, archived log only Restore: recovery of datafiles, tablespaces or the entire database Recovery: applying redo logs to bring the database to a specific point in time Duplicate: creating database copies, including standby databases for Data Guard RMAN and Data Guard #For standby database creation, RMAN allows DUPLICATE ... FOR STANDBY FROM ACTIVE DATABASE — a direct network copy from primary to standby, with no need for intermediate tape or disk backups. 
The command transfers all datafiles and controlfiles and configures them automatically for replication.\nWhy RMAN over manual copies #RMAN understands the internal structure of the Oracle database: it knows which blocks have changed (for incrementals), which files are needed, how to apply redo. A manual file copy (with cp or rsync) does not guarantee consistency and requires the database to be shut down. RMAN can work with the database open, with minimal performance impact.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/rman/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRMAN\u003c/strong\u003e (Recovery Manager) is Oracle\u0026rsquo;s native tool for database backup, restore and recovery. It is a command-line utility that manages all data protection operations in an integrated way with the database.\u003c/p\u003e\n\u003ch2 id=\"what-it-does\" class=\"relative group\"\u003eWhat it does \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#what-it-does\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eBackup\u003c/strong\u003e: full, incremental, archived log only\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eRestore\u003c/strong\u003e: recovery of datafiles, tablespaces or the entire database\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eRecovery\u003c/strong\u003e: applying redo logs to bring the database to a specific point in time\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eDuplicate\u003c/strong\u003e: creating database copies, including standby databases for Data Guard\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"rman-and-data-guard\" class=\"relative group\"\u003eRMAN and Data Guard \u003cspan class=\"absolute top-0 w-6 
transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#rman-and-data-guard\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eFor standby database creation, RMAN allows \u003ccode\u003eDUPLICATE ... FOR STANDBY FROM ACTIVE DATABASE\u003c/code\u003e — a direct network copy from primary to standby, with no need for intermediate tape or disk backups. The command transfers all datafiles and controlfiles and configures them automatically for replication.\u003c/p\u003e","title":"RMAN"},{"content":"ROI (Return on Investment) is the metric that measures investment return by relating net benefit to cost incurred, expressed as a percentage. A 200% ROI means every euro invested generated two euros in return.\nHow it works #It is calculated as: (Benefit - Cost) / Cost × 100. In the context of IT projects with AI components, ROI calculation must include not only implementation costs but also maintenance, team training, governance, and model error management costs.\nWhat it\u0026rsquo;s for #ROI is the primary tool for evaluating whether an AI investment makes economic sense. But in today\u0026rsquo;s market it is also the most abused tool: vendors promising triple-digit ROI based on controlled demos, without considering real operational costs. The AI Manager is the one who verifies that the numbers are real, not taken from slides.\nWhat can go wrong #An ROI calculated only on immediate benefits without counting hidden costs (model maintenance, periodic retraining, false positive management, GDPR compliance) is a false ROI. 
The difference between a successful AI project and an expensive failure almost always lies in the quality of the initial ROI calculation.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/roi/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eROI\u003c/strong\u003e (Return on Investment) is the metric that measures investment return by relating net benefit to cost incurred, expressed as a percentage. A 200% ROI means every euro invested generated two euros in return.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIt is calculated as: \u003ccode\u003e(Benefit - Cost) / Cost × 100\u003c/code\u003e. In the context of IT projects with AI components, ROI calculation must include not only implementation costs but also maintenance, team training, governance, and model error management costs.\u003c/p\u003e","title":"ROI"},{"content":"In PostgreSQL, ROLE is the only security entity. There is no distinction between \u0026ldquo;user\u0026rdquo; and \u0026ldquo;role\u0026rdquo;: everything is a ROLE. A ROLE with the LOGIN attribute behaves as a user; without LOGIN, it is a privilege container assignable to other ROLEs.\nHow it works #CREATE USER mario is simply a shortcut for CREATE ROLE mario WITH LOGIN. ROLEs can own objects, inherit privileges from other ROLEs through the INHERIT attribute, and be used to build permission hierarchies. 
A \u0026ldquo;functional\u0026rdquo; ROLE (without LOGIN) groups privileges; \u0026ldquo;user\u0026rdquo; ROLEs (with LOGIN) inherit them.\nWhat it\u0026rsquo;s for #The unified model enables designing clean security architectures: create functional ROLEs like role_readonly or role_write, assign privileges to the functional ROLEs, then assign those ROLEs to real users. When a new colleague arrives, a single GRANT role_readonly TO new_user is all it takes.\nWhy it matters #The model\u0026rsquo;s simplicity is its strength — but also a trap if misunderstood. Many administrators assign privileges directly to users instead of using functional ROLEs, creating a tangle of GRANTs impossible to maintain. The correct mental model is: privileges go to ROLEs, ROLEs go to users.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/postgresql-role/","section":"Glossary","summary":"\u003cp\u003eIn PostgreSQL, \u003cstrong\u003eROLE\u003c/strong\u003e is the only security entity. There is no distinction between \u0026ldquo;user\u0026rdquo; and \u0026ldquo;role\u0026rdquo;: everything is a ROLE. A ROLE with the \u003ccode\u003eLOGIN\u003c/code\u003e attribute behaves as a user; without \u003ccode\u003eLOGIN\u003c/code\u003e, it is a privilege container assignable to other ROLEs.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003e\u003ccode\u003eCREATE USER mario\u003c/code\u003e is simply a shortcut for \u003ccode\u003eCREATE ROLE mario WITH LOGIN\u003c/code\u003e. 
ROLEs can own objects, inherit privileges from other ROLEs through the \u003ccode\u003eINHERIT\u003c/code\u003e attribute, and be used to build permission hierarchies. A \u0026ldquo;functional\u0026rdquo; ROLE (without LOGIN) groups privileges; \u0026ldquo;user\u0026rdquo; ROLEs (with LOGIN) inherit them.\u003c/p\u003e","title":"ROLE"},{"content":"RPO (Recovery Point Objective) is the maximum amount of data an organisation can afford to lose in case of failure or disaster. It is measured in time: an RPO of 1 hour means accepting the loss of at most the last hour of transactions.\nHow it\u0026rsquo;s determined #RPO depends on the backup and replication strategy:\nStrategy Typical RPO Nightly tape backup 12-24 hours Backup + archived logs on remote storage 1-4 hours Asynchronous Data Guard (MaxPerformance) A few seconds Synchronous Data Guard (MaxAvailability) Zero RPO vs RTO #RPO and RTO are complementary but distinct:\nRPO: how much data you can lose (looks backward in time) RTO: how long it takes to restore service (looks forward in time) An organisation can have RPO=0 (zero data loss) but RTO=4 hours (it takes 4 hours to restart), or vice versa.\nWhy it matters #RPO determines the investment needed in replication infrastructure. Going from RPO=24 hours to RPO=0 can cost orders of magnitude more, but the cost must be weighed against the value of lost data — as in the case of six hours of unissued insurance policies.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/rpo/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRPO\u003c/strong\u003e (Recovery Point Objective) is the maximum amount of data an organisation can afford to lose in case of failure or disaster. 
It is measured in time: an RPO of 1 hour means accepting the loss of at most the last hour of transactions.\u003c/p\u003e\n\u003ch2 id=\"how-its-determined\" class=\"relative group\"\u003eHow it\u0026rsquo;s determined \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-its-determined\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eRPO depends on the backup and replication strategy:\u003c/p\u003e\n\u003ctable\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003eStrategy\u003c/th\u003e\n          \u003cth\u003eTypical RPO\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eNightly tape backup\u003c/td\u003e\n          \u003ctd\u003e12-24 hours\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eBackup + archived logs on remote storage\u003c/td\u003e\n          \u003ctd\u003e1-4 hours\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eAsynchronous Data Guard (MaxPerformance)\u003c/td\u003e\n          \u003ctd\u003eA few seconds\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eSynchronous Data Guard (MaxAvailability)\u003c/td\u003e\n          \u003ctd\u003eZero\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2 id=\"rpo-vs-rto\" class=\"relative group\"\u003eRPO vs RTO \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#rpo-vs-rto\" 
aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eRPO and RTO are complementary but distinct:\u003c/p\u003e","title":"RPO"},{"content":"RTO (Recovery Time Objective) is the maximum acceptable time to restore service after a failure or disaster. It is measured from the moment of failure to the moment the system is operational again.\nHow it\u0026rsquo;s determined #RTO depends on the recovery strategy and available infrastructure:\nStrategy Typical RTO Restore from tape backup 4-12 hours Restore from disk backup 1-4 hours Data Guard with manual switchover 1-5 minutes Data Guard with Fast-Start Failover 10-30 seconds RTO vs RPO # RTO: how long it takes to restart (looks forward) RPO: how much data you can lose (looks backward) They are independent metrics. A backup restore can have RTO=2 hours and RPO=24 hours. A synchronous Data Guard can have RTO=30 seconds and RPO=0.\nThe business impact #RTO has a direct and measurable impact: every minute of downtime translates into blocked operations, unserved customers, lost revenue. The difference between RTO=6 hours and RTO=42 seconds — as in the case of moving from single instance to Data Guard — can be worth more than the cost of the entire infrastructure.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/rto/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eRTO\u003c/strong\u003e (Recovery Time Objective) is the maximum acceptable time to restore service after a failure or disaster. 
It is measured from the moment of failure to the moment the system is operational again.\u003c/p\u003e\n\u003ch2 id=\"how-its-determined\" class=\"relative group\"\u003eHow it\u0026rsquo;s determined \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-its-determined\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eRTO depends on the recovery strategy and available infrastructure:\u003c/p\u003e\n\u003ctable\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003eStrategy\u003c/th\u003e\n          \u003cth\u003eTypical RTO\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eRestore from tape backup\u003c/td\u003e\n          \u003ctd\u003e4-12 hours\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eRestore from disk backup\u003c/td\u003e\n          \u003ctd\u003e1-4 hours\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eData Guard with manual switchover\u003c/td\u003e\n          \u003ctd\u003e1-5 minutes\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003eData Guard with Fast-Start Failover\u003c/td\u003e\n          \u003ctd\u003e10-30 seconds\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2 id=\"rto-vs-rpo\" class=\"relative group\"\u003eRTO vs RPO \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#rto-vs-rpo\" 
aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eRTO\u003c/strong\u003e: how long it takes to restart (looks forward)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eRPO\u003c/strong\u003e: how much data you can lose (looks backward)\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThey are independent metrics. A backup restore can have RTO=2 hours and RPO=24 hours. A synchronous Data Guard can have RTO=30 seconds and RPO=0.\u003c/p\u003e","title":"RTO"},{"content":"The SCAN Listener (Single Client Access Name) is the Oracle RAC component that provides a single DNS name for cluster access. Applications connect to the SCAN name without needing to know individual nodes: the listener automatically distributes connections among active nodes.\nHow it works #SCAN is a DNS name that resolves to three virtual IP addresses (VIPs) distributed across cluster nodes. When a client connects to the SCAN name, DNS returns one of the three IPs, and the listener on that IP redirects the connection to the most appropriate node based on the requested service and load.\nThe advantage is that application connection strings never change: if a node is added to or removed from the cluster, SCAN handles everything transparently.\nTypical configuration #A connection string using SCAN:\njdbc:oracle:thin:@//scan-name.example.com:1521/service_name The three SCAN VIPs run on any cluster node. In a two-node cluster, one node hosts two VIPs and the other hosts one (or vice versa).\nIn migrations #In OCI migrations, the SCAN listener is reconfigured with the new infrastructure\u0026rsquo;s DNS. It\u0026rsquo;s one of the cutover steps: updating connection strings to point to the new SCAN name on OCI. 
If naming is well managed, it\u0026rsquo;s a change in a single place (the application\u0026rsquo;s connection pool), not in dozens of scattered configuration files.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/scan-listener/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eSCAN Listener\u003c/strong\u003e (Single Client Access Name) is the Oracle RAC component that provides a single DNS name for cluster access. Applications connect to the SCAN name without needing to know individual nodes: the listener automatically distributes connections among active nodes.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eSCAN is a DNS name that resolves to three virtual IP addresses (VIPs) distributed across cluster nodes. When a client connects to the SCAN name, DNS returns one of the three IPs, and the listener on that IP redirects the connection to the most appropriate node based on the requested service and load.\u003c/p\u003e","title":"SCAN Listener"},{"content":"SCD (Slowly Changing Dimension) refers to a set of techniques used in data warehousing to manage changes in dimension table data over time.\nMain types # Type 1: overwrite the previous value. No history preserved Type 2: insert a new row with validity dates (start date, end date). Preserves full history Type 3: add a column for the previous value. Preserves only the last change Why it matters #In a transactional database, when a customer changes address you update the record. 
In a data warehouse this would mean losing history: all previous sales would appear associated with the new address.\nSCD Type 2 solves this problem by maintaining one row for each version of the data, with validity dates that allow reconstructing the situation at any point in time.\nWhen to use it #The choice of type depends on the business requirement. If only the current value matters, Type 1 is sufficient. If the business needs accurate historical analysis — and in most real-world data warehouses it does — Type 2 is the standard choice.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/scd/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSCD\u003c/strong\u003e (Slowly Changing Dimension) refers to a set of techniques used in data warehousing to manage changes in dimension table data over time.\u003c/p\u003e\n\u003ch2 id=\"main-types\" class=\"relative group\"\u003eMain types \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#main-types\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eType 1\u003c/strong\u003e: overwrite the previous value. No history preserved\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eType 2\u003c/strong\u003e: insert a new row with validity dates (start date, end date). Preserves full history\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eType 3\u003c/strong\u003e: add a column for the previous value. 
Preserves only the last change\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"why-it-matters\" class=\"relative group\"\u003eWhy it matters \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#why-it-matters\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a transactional database, when a customer changes address you update the record. In a data warehouse this would mean losing history: all previous sales would appear associated with the new address.\u003c/p\u003e","title":"SCD"},{"content":"A Schema in a relational database is a logical namespace that groups objects such as tables, views, functions, and sequences. It functions as an organizational container within a database.\nHow it works #In PostgreSQL, the default schema is public. To access an object in another schema, the prefix is required: schema1.table. The USAGE privilege on a schema is a prerequisite for accessing any object within it — without USAGE, even a GRANT SELECT on tables does not work.\nWhat it\u0026rsquo;s for #Schemas allow logical data separation: one schema for the application, one for reporting, one for staging tables. In Oracle, the concept is different: each user is automatically a schema, and objects created by that user live in their schema. In PostgreSQL, schemas and users are independent entities.\nWhy it matters #Schema permission management is the most common source of errors when creating users with limited access. 
Forgetting GRANT USAGE ON SCHEMA is the classic mistake that generates \u0026ldquo;permission denied for schema\u0026rdquo; even when table permissions are correct.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/schema/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eSchema\u003c/strong\u003e in a relational database is a logical namespace that groups objects such as tables, views, functions, and sequences. It functions as an organizational container within a database.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn PostgreSQL, the default schema is \u003ccode\u003epublic\u003c/code\u003e. To access an object in another schema, the prefix is required: \u003ccode\u003eschema1.table\u003c/code\u003e. The \u003ccode\u003eUSAGE\u003c/code\u003e privilege on a schema is a prerequisite for accessing any object within it — without \u003ccode\u003eUSAGE\u003c/code\u003e, even a \u003ccode\u003eGRANT SELECT\u003c/code\u003e on tables does not work.\u003c/p\u003e","title":"Schema"},{"content":"A project\u0026rsquo;s Scope defines the perimeter of what the project must deliver: included features, expected deliverables, constraints, and boundaries agreed with stakeholders. Everything inside the scope gets done; everything outside does not.\nHow it works #Scope is defined in the early project phases through documents like the Statement of Work or Project Charter. 
Any subsequent change request must go through a formal change management process to evaluate its impact on timeline, budget, and resources.\nWhy it matters #Scope creep — the uncontrolled expansion of requirements — is among the leading causes of IT project failure. Every feature added without reassessing timeline and budget erodes available resources. An effective PM knows how to say \u0026ldquo;yes, and to include this we need to remove that\u0026rdquo; — not simply \u0026ldquo;no.\u0026rdquo;\nWhen to use it #In every project phase: in planning to define boundaries, during execution to evaluate change requests, in stakeholder negotiations to redirect expectations. Clarity on scope is the foundation of every project decision.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/scope/","section":"Glossary","summary":"\u003cp\u003eA project\u0026rsquo;s \u003cstrong\u003eScope\u003c/strong\u003e defines the perimeter of what the project must deliver: included features, expected deliverables, constraints, and boundaries agreed with stakeholders. Everything inside the scope gets done; everything outside does not.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eScope is defined in the early project phases through documents like the Statement of Work or Project Charter. 
Any subsequent change request must go through a formal change management process to evaluate its impact on timeline, budget, and resources.\u003c/p\u003e","title":"Scope"},{"content":"Scope Creep is the progressive and often uncontrolled expansion of a project\u0026rsquo;s scope beyond what was initially defined. New requirements, specification changes and additional features accumulate without a corresponding adjustment to budget and timelines.\nHow it works #In a software project, scope creep typically starts with seemingly small requests: \u0026ldquo;let\u0026rsquo;s add this field too\u0026rdquo;, \u0026ldquo;it would be useful to have this function as well\u0026rdquo;. Each individual change seems reasonable, but the cumulative effect is devastating. Specifications become a moving target, the team can never reach a stable baseline, and the project enters an endless cycle of revisions.\nWhat it\u0026rsquo;s for #Recognising scope creep is the first step to preventing it. Defence mechanisms include: formal change requests with impact analysis, specification freezes per phase, rigorous prioritisation and the ability to say \u0026ldquo;no\u0026rdquo; — or at least \u0026ldquo;not now\u0026rdquo;.\nWhen to use it #The term describes a project management anti-pattern to avoid. In large consulting projects, scope creep often becomes a weapon in the supplier\u0026rsquo;s hands: incomplete specifications justify delays and additional costs, turning a fixed-scope project into an open-ended engagement.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/scope-creep/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eScope Creep\u003c/strong\u003e is the progressive and often uncontrolled expansion of a project\u0026rsquo;s scope beyond what was initially defined. 
New requirements, specification changes and additional features accumulate without a corresponding adjustment to budget and timelines.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a software project, scope creep typically starts with seemingly small requests: \u0026ldquo;let\u0026rsquo;s add this field too\u0026rdquo;, \u0026ldquo;it would be useful to have this function as well\u0026rdquo;. Each individual change seems reasonable, but the cumulative effect is devastating. Specifications become a moving target, the team can never reach a stable baseline, and the project enters an endless cycle of revisions.\u003c/p\u003e","title":"Scope Creep"},{"content":"Scrum is an agile framework for project management that organizes work into fixed-length iterations called sprints (typically 2 weeks). It defines three roles (Product Owner, Scrum Master, Development Team) and four ceremonies (Sprint Planning, Daily Standup, Sprint Review, Sprint Retrospective).\nHow it works #Each sprint starts with planning, continues with daily standups for synchronization, and ends with a review (what was done) and a retrospective (how to improve the process). Timeboxing is the fundamental principle: every ceremony has a non-negotiable maximum duration.\nWhat it\u0026rsquo;s for #Scrum provides structure for teams working on complex projects with evolving requirements. The short sprint cycle enables rapid feedback, frequent course corrections, and continuous visibility into project status. 
The daily standup is one of the framework\u0026rsquo;s most recognizable ceremonies.\nWhat can go wrong #The most common risk is adopting Scrum as a ritual without understanding its principles. Teams that hold standups but don\u0026rsquo;t respect the timebox, sprints without a clear goal, retrospectives that don\u0026rsquo;t produce concrete actions. Scrum works when applied with discipline, not when it\u0026rsquo;s just a label.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/scrum/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eScrum\u003c/strong\u003e is an agile framework for project management that organizes work into fixed-length iterations called sprints (typically 2 weeks). It defines three roles (Product Owner, Scrum Master, Development Team) and four ceremonies (Sprint Planning, Daily Standup, Sprint Review, Sprint Retrospective).\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach sprint starts with planning, continues with daily standups for synchronization, and ends with a review (what was done) and a retrospective (how to improve the process). Timeboxing is the fundamental principle: every ceremony has a non-negotiable maximum duration.\u003c/p\u003e","title":"Scrum"},{"content":"secure-file-priv is a MySQL system variable that controls where LOAD DATA INFILE, SELECT INTO OUTFILE and the LOAD_FILE() function can operate on the server\u0026rsquo;s filesystem.\nHow it works #The variable accepts three values: a specific path (e.g. 
/var/lib/mysql-files/), which limits file operations to that directory; an empty string (\u0026quot;\u0026quot;), which imposes no restrictions; or NULL, which completely disables file operations. The value can only be set in the configuration file (my.cnf) and requires a service restart to change — it cannot be modified at runtime.\nWhat it\u0026rsquo;s for #The directive prevents arbitrary filesystem access by MySQL users with the FILE privilege. Without this protection, an attacker exploiting SQL injection could read system files (e.g. /etc/passwd, SSH keys) or write web shells into the webroot of a web server on the same host.\nWhen to use it #secure-file-priv should be configured at setup time for every MySQL instance, specifying a dedicated directory. In multi-instance environments, each instance should have its own secure-file-priv directory. If file export is blocked, the recommended alternative is using the mysql command-line client with -B and -e options to redirect output.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/secure-file-priv/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003esecure-file-priv\u003c/strong\u003e is a MySQL system variable that controls where \u003ccode\u003eLOAD DATA INFILE\u003c/code\u003e, \u003ccode\u003eSELECT INTO OUTFILE\u003c/code\u003e and the \u003ccode\u003eLOAD_FILE()\u003c/code\u003e function can operate on the server\u0026rsquo;s filesystem.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe variable accepts three values: a specific path (e.g. 
\u003ccode\u003e/var/lib/mysql-files/\u003c/code\u003e), which limits file operations to that directory; an empty string (\u003ccode\u003e\u0026quot;\u0026quot;\u003c/code\u003e), which imposes no restrictions; or \u003ccode\u003eNULL\u003c/code\u003e, which completely disables file operations. The value can only be set in the configuration file (\u003ccode\u003emy.cnf\u003c/code\u003e) and requires a service restart to change — it cannot be modified at runtime.\u003c/p\u003e","title":"secure-file-priv"},{"content":"Self-parenting is a dimensional modeling technique used to balance ragged hierarchies. The principle is simple: an entity that doesn\u0026rsquo;t have an upper hierarchical level becomes its own parent at that level.\nHow it works #In a three-level hierarchy Top Group → Group → Client:\nA Client without a Group uses its own name/ID as its Group A Group without a Top Group uses its own name/ID as its Top Group The result is a dimension table with no NULLs in the hierarchical columns, with all levels always populated.\nDistinction flags #To preserve information about which entities were artificially balanced, flags are added to the dimension:\nis_direct_client = 'Y': the client didn\u0026rsquo;t have a Group in the source is_standalone_group = 'Y': the Group didn\u0026rsquo;t have a Top Group in the source These flags allow the business to filter \u0026ldquo;real\u0026rdquo; top groups from promoted clients.\nWhy in the ETL, not in the report #Self-parenting is applied once in the ETL, not in every single report. A report should do GROUP BY and JOIN, not decide how to handle missing levels. If the balancing logic is in the model, all reports benefit automatically.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/self-parenting/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSelf-parenting\u003c/strong\u003e is a dimensional modeling technique used to balance ragged hierarchies. 
The principle is simple: an entity that doesn\u0026rsquo;t have an upper hierarchical level becomes its own parent at that level.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a three-level hierarchy Top Group → Group → Client:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eA Client without a Group uses its own name/ID as its Group\u003c/li\u003e\n\u003cli\u003eA Group without a Top Group uses its own name/ID as its Top Group\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eThe result is a dimension table with no NULLs in the hierarchical columns, with all levels always populated.\u003c/p\u003e","title":"Self-parenting"},{"content":"A Sequential Scan (Seq Scan) is the operation where PostgreSQL reads a table from start to finish, block by block, without using any index. It\u0026rsquo;s PostgreSQL\u0026rsquo;s equivalent of Oracle\u0026rsquo;s Full Table Scan.\nWhen it\u0026rsquo;s normal #On small tables (a few thousand rows), a sequential scan is often the most efficient choice. Reading an entire table sequentially is faster than index lookups when the table fits in a few pages. The optimizer chooses a sequential scan when it estimates it\u0026rsquo;s cheaper than an index scan.\nWhen it\u0026rsquo;s a problem #On large tables (millions of rows), a sequential scan to return few rows is a red flag. It means an appropriate index is missing or the table statistics are outdated and the optimizer is making wrong estimates. 
pg_stat_statements helps identify these situations by showing queries with the worst blocks read / rows returned ratio.\nHow to diagnose it #EXPLAIN shows \u0026ldquo;Seq Scan on table\u0026rdquo; in the execution plan. If the subsequent filter discards most rows (rows removed by filter \u0026raquo; rows), an index on the filter column is almost certainly needed.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/sequential-scan/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eSequential Scan\u003c/strong\u003e (Seq Scan) is the operation where PostgreSQL reads a table from start to finish, block by block, without using any index. It\u0026rsquo;s PostgreSQL\u0026rsquo;s equivalent of Oracle\u0026rsquo;s Full Table Scan.\u003c/p\u003e\n\u003ch2 id=\"when-its-normal\" class=\"relative group\"\u003eWhen it\u0026rsquo;s normal \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#when-its-normal\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eOn small tables (a few thousand rows), a sequential scan is often the most efficient choice. Reading an entire table sequentially is faster than index lookups when the table fits in a few pages. The optimizer chooses a sequential scan when it estimates it\u0026rsquo;s cheaper than an index scan.\u003c/p\u003e","title":"Sequential Scan"},{"content":"The SGA (System Global Area) is Oracle Database\u0026rsquo;s main shared memory area. It contains fundamental data structures: buffer cache (data pages read from disk), shared pool (execution plans and data dictionary), redo log buffer, and large pool.\nHow it works #SGA size is controlled by the SGA_TARGET or SGA_MAX_SIZE parameter. 
Oracle allocates the SGA at instance startup in the operating system\u0026rsquo;s shared memory. The Linux kernel parameters shmmax and shmall must be sized to allow complete SGA allocation.\nWhat it\u0026rsquo;s for #All database read and write activity passes through the SGA. An efficient buffer cache avoids physical disk reads. A well-sized shared pool avoids query re-parsing. The SGA is the heart of Oracle performance — and must reside in Huge Pages to maximize efficiency.\nWhy it matters #An SGA not allocated in Huge Pages means millions of Page Table entries and constant TLB overflow. The result is latch free waits, library cache contention, and high CPU. Configuring Huge Pages and the memlock unlimited parameter for the oracle user is the prerequisite for any serious tuning.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/sga/","section":"Glossary","summary":"\u003cp\u003eThe \u003cstrong\u003eSGA\u003c/strong\u003e (System Global Area) is Oracle Database\u0026rsquo;s main shared memory area. It contains fundamental data structures: buffer cache (data pages read from disk), shared pool (execution plans and data dictionary), redo log buffer, and large pool.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eSGA size is controlled by the \u003ccode\u003eSGA_TARGET\u003c/code\u003e or \u003ccode\u003eSGA_MAX_SIZE\u003c/code\u003e parameter. Oracle allocates the SGA at instance startup in the operating system\u0026rsquo;s shared memory. 
The Linux kernel parameters \u003ccode\u003eshmmax\u003c/code\u003e and \u003ccode\u003eshmall\u003c/code\u003e must be sized to allow complete SGA allocation.\u003c/p\u003e","title":"SGA"},{"content":"shared_buffers is the parameter that controls the size of the shared memory area PostgreSQL uses as a cache for data blocks read from disk. Every time PostgreSQL reads a data page (8 KB), it keeps it in shared_buffers for subsequent reads.\nHow it works #PostgreSQL allocates memory for shared_buffers at service startup. All backend processes share this memory area. When a process needs a data block, it first looks in shared_buffers. If it finds it (cache hit), the read is immediate. If not (cache miss), it must read from disk — an operation orders of magnitude slower.\nHow much to allocate #The default value is 128 MB — inadequate for any production database. The rule of thumb is to set shared_buffers to 25% of available RAM. On a server with 64 GB of RAM, 16 GB is a good starting point. Values beyond 40% of RAM rarely bring benefits because PostgreSQL also relies on the operating system\u0026rsquo;s cache.\nHow to monitor it #The pg_stat_bgwriter view shows the ratio between buffers_alloc (newly allocated blocks) and the total blocks served. A cache hit ratio below 95% suggests that shared_buffers may be undersized.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/shared-buffers/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eshared_buffers\u003c/strong\u003e is the parameter that controls the size of the shared memory area PostgreSQL uses as a cache for data blocks read from disk. 
Every time PostgreSQL reads a data page (8 KB), it keeps it in shared_buffers for subsequent reads.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003ePostgreSQL allocates memory for shared_buffers at service startup. All backend processes share this memory area. When a process needs a data block, it first looks in shared_buffers. If it finds it (cache hit), the read is immediate. If not (cache miss), it must read from disk — an operation orders of magnitude slower.\u003c/p\u003e","title":"shared_buffers"},{"content":"Single-primary is the most common operating mode in MySQL Group Replication, where only one node in the cluster — the primary — accepts write operations. The other nodes (secondaries) are read-only (read_only=ON, super_read_only=ON) and receive changes through the group\u0026rsquo;s synchronous replication.\nHow it works #The parameter group_replication_single_primary_mode=ON enables this mode. The primary is the only node with read_only=OFF. If the primary is stopped or becomes unreachable, the cluster triggers an automatic election and one of the secondaries becomes the new primary within seconds.\nWhy it\u0026rsquo;s used #Single-primary mode avoids the concurrent write conflicts typical of multi-primary setups. In production, most MySQL clusters use this mode because it\u0026rsquo;s more predictable: applications write to a single endpoint, replication is linear, and debugging is simpler.\nWhat can go wrong #When the primary is stopped for maintenance, the cluster performs an automatic failover. 
During those seconds, active connections may be dropped and in-flight transactions may fail. It\u0026rsquo;s a brief disruption but it must be communicated. The practical rule: in a maintenance intervention on a single-primary cluster, secondaries are touched first, the primary last.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/single-primary/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSingle-primary\u003c/strong\u003e is the most common operating mode in MySQL Group Replication, where only one node in the cluster — the primary — accepts write operations. The other nodes (secondaries) are read-only (\u003ccode\u003eread_only=ON\u003c/code\u003e, \u003ccode\u003esuper_read_only=ON\u003c/code\u003e) and receive changes through the group\u0026rsquo;s synchronous replication.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe parameter \u003ccode\u003egroup_replication_single_primary_mode=ON\u003c/code\u003e enables this mode. The primary is the only node with \u003ccode\u003eread_only=OFF\u003c/code\u003e. If the primary is stopped or becomes unreachable, the cluster triggers an automatic election and one of the secondaries becomes the new primary within seconds.\u003c/p\u003e","title":"Single-primary"},{"content":"Smart Working (agile working) is an organizational model that allows employees to work from any location, combining office days and remote days, with flexible hours and evaluation based on results instead of presence.\nHow it works #The typical model is 80/20: 80% remote, 20% in person. 
Office days are for workshops, project reviews, and team building — not for warming chairs. The company provides equipment for the home workstation (monitor, ergonomic chair, headset) and a connectivity allowance.\nWhat it\u0026rsquo;s for #In IT consulting, smart working eliminates commuting costs (up to 90 hours/month for a Roman consultant), reduces real estate costs (from 50 fixed workstations to 15 hot desks), and restores hours of real productivity. A consultant who starts working at 7:45 fresh and focused produces more than one who arrives at 9:30 stressed from traffic.\nWhy it matters #Smart working is not a perk — it is an organizational model that requires clear KPIs, mutual trust, and adequate communication tools. Companies that adopt it as a \u0026ldquo;concession\u0026rdquo; instead of a \u0026ldquo;strategy\u0026rdquo; lose its benefits. Those that reject it due to presenteeism pay an invisible cost in lost productivity and fleeing talent.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/smart-working/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSmart Working\u003c/strong\u003e (agile working) is an organizational model that allows employees to work from any location, combining office days and remote days, with flexible hours and evaluation based on results instead of presence.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe typical model is 80/20: 80% remote, 20% in person. Office days are for workshops, project reviews, and team building — not for warming chairs. 
The company provides equipment for the home workstation (monitor, ergonomic chair, headset) and a connectivity allowance.\u003c/p\u003e","title":"Smart Working"},{"content":"Snapshot in Oracle is a point-in-time capture of database performance statistics stored in the AWR repository. By default Oracle generates a snapshot every 60 minutes and retains them for 8 days.\nHow it works #Each snapshot records hundreds of metrics: wait events, SQL statistics, memory metrics (SGA, PGA), I/O by datafile, system statistics. Comparing two snapshots generates the AWR report, which shows what changed between the two points in time.\nManual snapshots #In emergency situations you can generate a manual snapshot to capture the current state:\nEXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot; This is useful when you want an immediate reference point — for example, before and after a deploy — without waiting for the automatic cycle.\nManagement #Snapshots are accessible through the DBA_HIST_SNAPSHOT view. Retention (how many days to keep them) and interval (how often to generate them) are configured with:\nEXEC DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings( retention =\u0026gt; 43200, -- 30 days in minutes interval =\u0026gt; 30 -- every 30 minutes ); Why they matter #Without snapshots, there is no AWR. Without AWR, performance diagnosis becomes guesswork instead of data-driven analysis. Snapshots are the foundation of observability in Oracle.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/snapshot-oracle/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSnapshot\u003c/strong\u003e in Oracle is a point-in-time capture of database performance statistics stored in the AWR repository. 
By default Oracle generates a snapshot every 60 minutes and retains them for 8 days.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach snapshot records hundreds of metrics: wait events, SQL statistics, memory metrics (SGA, PGA), I/O by datafile, system statistics. Comparing two snapshots generates the AWR report, which shows what changed between the two points in time.\u003c/p\u003e","title":"Snapshot (Oracle)"},{"content":"Split-brain is a critical condition that occurs when a database cluster splits into two or more partitions that cannot communicate with each other, and each partition continues accepting writes independently. The result is divergent data impossible to reconcile automatically.\nHow it works #In a 3-node cluster, if the network between Node 1 and Nodes 2-3 breaks, without quorum protection both parts could continue accepting writes. When the network is restored, the cluster would find itself with two different versions of the same data. The quorum mechanism prevents this scenario: only the partition with the majority of nodes (quorum) can continue operating.\nWhat it\u0026rsquo;s for #Understanding split-brain is fundamental for designing reliable database clusters. It\u0026rsquo;s the main reason Galera Cluster requires an odd number of nodes (3, 5, 7) and implements the quorum mechanism. With an even number of nodes, a network partition can split the cluster into two equal halves, neither of which has quorum.\nWhen to use it #The term split-brain describes a risk to avoid, not a feature to enable. 
In Galera, protection is automatic: nodes that lose quorum transition to Non-Primary state and refuse writes. The pc.ignore_quorum parameter disables this protection, but using it in production is strongly discouraged.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/split-brain/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSplit-brain\u003c/strong\u003e is a critical condition that occurs when a database cluster splits into two or more partitions that cannot communicate with each other, and each partition continues accepting writes independently. The result is divergent data impossible to reconcile automatically.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn a 3-node cluster, if the network between Node 1 and Nodes 2-3 breaks, without quorum protection both parts could continue accepting writes. When the network is restored, the cluster would find itself with two different versions of the same data. The quorum mechanism prevents this scenario: only the partition with the majority of nodes (quorum) can continue operating.\u003c/p\u003e","title":"Split-brain"},{"content":"SQL Injection is one of the most widespread and dangerous vulnerabilities in web applications. It occurs when user-supplied input is inserted directly into SQL queries without validation or parameterisation, allowing an attacker to modify the query logic.\nHow it works #The attacker inserts SQL code fragments into application input fields (login forms, search fields, URL parameters). 
If the application concatenates these inputs directly into SQL queries, the malicious code is executed by the database with the application user\u0026rsquo;s privileges. Combined with MySQL\u0026rsquo;s FILE privilege and an unconfigured secure-file-priv, the attacker can read system files or write arbitrary files on the server.\nWhat it\u0026rsquo;s for #Understanding SQL injection is fundamental for anyone managing databases in production, because many security configurations (such as secure-file-priv, privilege management and user separation) exist specifically to mitigate the impact of this type of attack.\nWhen to use it #The term describes an attack to prevent, not a technique to use. The main countermeasures are: parameterised queries (prepared statements), input validation, principle of least privilege for database users, and correct configuration of directives like secure-file-priv.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/sql-injection/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSQL Injection\u003c/strong\u003e is one of the most widespread and dangerous vulnerabilities in web applications. It occurs when user-supplied input is inserted directly into SQL queries without validation or parameterisation, allowing an attacker to modify the query logic.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe attacker inserts SQL code fragments into application input fields (login forms, search fields, URL parameters). 
If the application concatenates these inputs directly into SQL queries, the malicious code is executed by the database with the application user\u0026rsquo;s privileges. Combined with MySQL\u0026rsquo;s \u003ccode\u003eFILE\u003c/code\u003e privilege and an unconfigured \u003ccode\u003esecure-file-priv\u003c/code\u003e, the attacker can read system files or write arbitrary files on the server.\u003c/p\u003e","title":"SQL Injection"},{"content":"SST (State Snapshot Transfer) is the mechanism by which a Galera node joining the cluster (or one that has been offline too long) receives a complete copy of the entire dataset from a donor node.\nHow it works #When a node joins the cluster and the gap of missing transactions exceeds the gcache size, the cluster initiates an SST. The donor node creates a full snapshot of the database and transfers it to the receiving node. Available methods include: mariabackup (doesn\u0026rsquo;t block the donor), rsync (fast but blocks the donor for reads), and mysqldump (slow and blocking).\nWhat it\u0026rsquo;s for #SST is essential for two scenarios: adding a new node to the cluster (first join) and recovering a node that has been offline so long that missing transactions are no longer available in the donor\u0026rsquo;s gcache.\nWhen to use it #SST is triggered automatically by Galera when needed. The SST method choice (wsrep_sst_method) is made during configuration. 
In production, mariabackup is the recommended choice because it doesn\u0026rsquo;t block the donor node, avoiding cluster degradation during the transfer.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/sst/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSST\u003c/strong\u003e (State Snapshot Transfer) is the mechanism by which a Galera node joining the cluster (or one that has been offline too long) receives a complete copy of the entire dataset from a donor node.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a node joins the cluster and the gap of missing transactions exceeds the gcache size, the cluster initiates an SST. The donor node creates a full snapshot of the database and transfers it to the receiving node. Available methods include: \u003ccode\u003emariabackup\u003c/code\u003e (doesn\u0026rsquo;t block the donor), \u003ccode\u003ersync\u003c/code\u003e (fast but blocks the donor for reads), and \u003ccode\u003emysqldump\u003c/code\u003e (slow and blocking).\u003c/p\u003e","title":"SST"},{"content":"A Stakeholder is any person, group, or organization with a direct or indirect interest in a project\u0026rsquo;s outcome. This includes clients, sponsors, end users, development teams, management, and external vendors.\nHow it works #In project management, stakeholders are identified, classified by level of influence and interest, and managed with differentiated communication strategies. 
A high-influence, high-interest stakeholder (like the CTO) requires active involvement; a low-influence one requires only periodic updates.\nWhy it matters #Most project failures are not technical — they are relational. Misaligned stakeholders, unmanaged expectations, and poor communication are the most frequent causes of delays, scope creep, and conflicts. An effective PM spends more time managing stakeholders than managing timelines.\nWhen to use it #In every project phase: in requirements definition (who decides what gets built), in planning (who approves resources), in execution (who validates deliverables), and in closure (who accepts the result). Ignoring a key stakeholder is the fastest way to derail a project.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/stakeholder/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eStakeholder\u003c/strong\u003e is any person, group, or organization with a direct or indirect interest in a project\u0026rsquo;s outcome. This includes clients, sponsors, end users, development teams, management, and external vendors.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn project management, stakeholders are identified, classified by level of influence and interest, and managed with differentiated communication strategies. A high-influence, high-interest stakeholder (like the CTO) requires active involvement; a low-influence one requires only periodic updates.\u003c/p\u003e","title":"Stakeholder"},{"content":"A star schema is the most widely used data model in data warehousing. 
It gets its name from its shape: a central fact table connected to multiple dimension tables surrounding it, like the points of a star.\nStructure # Fact table at the center: contains numeric measures and foreign keys to dimensions Dimension tables around it: contain descriptive attributes (who, what, where, when) in a denormalized structure Dimensions in a star schema are typically denormalized — all attributes in a single flat table, with no normalized hierarchies. This simplifies queries and improves aggregation performance.\nWhy it works #The star schema is optimized for analytical queries:\nJoins are simple: the fact connects directly to each dimension with a single join Aggregations are fast: database optimizers recognize the pattern and optimize for it It\u0026rsquo;s intuitive for business users: the structure mirrors how they think about data (sales by product, by region, by period) Star schema vs Snowflake #A snowflake schema normalizes the dimensions, splitting them into sub-tables. It saves space but complicates queries with additional joins. In practice, star schemas are preferred in most cases because the simplicity of queries far outweighs the cost of extra space in dimensions.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/star-schema/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003estar schema\u003c/strong\u003e is the most widely used data model in data warehousing. 
It gets its name from its shape: a central fact table connected to multiple dimension tables surrounding it, like the points of a star.\u003c/p\u003e\n\u003ch2 id=\"structure\" class=\"relative group\"\u003eStructure \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#structure\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eFact table\u003c/strong\u003e at the center: contains numeric measures and foreign keys to dimensions\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eDimension tables\u003c/strong\u003e around it: contain descriptive attributes (who, what, where, when) in a denormalized structure\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eDimensions in a star schema are typically denormalized — all attributes in a single flat table, with no normalized hierarchies. This simplifies queries and improves aggregation performance.\u003c/p\u003e","title":"Star schema"},{"content":"A surrogate key is a sequential numeric identifier generated internally by the data warehouse, with no business meaning. It is distinct from the natural key — the one coming from the source system (e.g. customer code, employee number).\nWhy it matters #In SCD Type 2, the same customer can have multiple rows in the dimension table — one for each historical version. The natural key (customer_id) is no longer unique, so you need an identifier that distinguishes each individual version: the surrogate key (customer_key).\nHow it works #It\u0026rsquo;s typically generated by a sequence (Oracle) or a SERIAL/IDENTITY column (PostgreSQL, MySQL). 
It\u0026rsquo;s never exposed to end users and has no meaning outside the data warehouse.\nThe fact table uses the surrogate key as a foreign key, pointing to the specific dimension version that was valid at the time of the fact. This ensures every transaction is associated with the correct dimensional context for that point in time.\nAdvantages # Enables dimension versioning (SCD Type 2) Joins between fact and dimension are on integers, so they\u0026rsquo;re fast Insulates the DWH from changes in source system keys Supports loading from multiple sources with potentially duplicate natural keys ","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/chiave-surrogata/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003esurrogate key\u003c/strong\u003e is a sequential numeric identifier generated internally by the data warehouse, with no business meaning. It is distinct from the natural key — the one coming from the source system (e.g. customer code, employee number).\u003c/p\u003e\n\u003ch2 id=\"why-it-matters\" class=\"relative group\"\u003eWhy it matters \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#why-it-matters\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIn SCD Type 2, the same customer can have multiple rows in the dimension table — one for each historical version. 
The natural key (\u003ccode\u003ecustomer_id\u003c/code\u003e) is no longer unique, so you need an identifier that distinguishes each individual version: the surrogate key (\u003ccode\u003ecustomer_key\u003c/code\u003e).\u003c/p\u003e","title":"Surrogate key"},{"content":"Sustainable Mobility is an approach to urban transport that favors the use of low environmental impact means — cycling, public transport, electric vehicles, car sharing — over private combustion-engine cars.\nHow it works #It is based on a paradigm shift: instead of building more roads for more cars, invest in cycling infrastructure, efficient public transport, and incentives for active mobility. Cities like Amsterdam, Copenhagen, and Munich demonstrate that the model works at scale.\nWhat it\u0026rsquo;s for #It reduces emissions, traffic, individual and collective costs, noise pollution, and land use. For the individual commuter, it means saving up to €5,450 per year and 455 hours of time compared to driving — two and a half months of life returned.\nWhy it matters #Rome has 300 sunny days per year, a mild climate, and contained urban distances. It is paradoxically one of the Italian cities best suited for cycling. What\u0026rsquo;s missing is infrastructure and the courage to change habits. 
The ideal model combines smart working (3 remote days) with sustainable mobility (2 bike days): from 13 weekly hours commuting to 1 hour and 12 minutes.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/mobilita-sostenibile/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSustainable Mobility\u003c/strong\u003e is an approach to urban transport that favors the use of low environmental impact means — cycling, public transport, electric vehicles, car sharing — over private combustion-engine cars.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eIt is based on a paradigm shift: instead of building more roads for more cars, invest in cycling infrastructure, efficient public transport, and incentives for active mobility. Cities like Amsterdam, Copenhagen, and Munich demonstrate that the model works at scale.\u003c/p\u003e","title":"Sustainable Mobility"},{"content":"Swappiness (vm.swappiness) is a Linux kernel parameter that controls how aggressively the system moves memory pages from RAM to swap on disk. The value ranges from 0 (swap only in extreme cases) to 100 (aggressive swapping). The default is 60.\nHow it works #With the default value of 60, Linux starts swapping when memory pressure is still relatively low. For a dedicated database server, this is unacceptable: the SGA must stay in RAM, always. 
The recommended value for Oracle is 1 — not 0, which would completely disable swap and could trigger the OOM killer.\nWhat it\u0026rsquo;s for #The value 1 tells the kernel: \u0026ldquo;Swap only if there is truly no alternative.\u0026rdquo; This ensures the SGA and Oracle\u0026rsquo;s critical structures remain in physical memory, avoiding swap reads (orders of magnitude slower than RAM) during query execution.\nWhy it matters #With swappiness at 60, a server with 128 GB of RAM and a 64 GB SGA may start swapping parts of the SGA even with 20-30 GB of free RAM. The result is unpredictably degraded performance, with latency spikes that look like application problems but are actually the OS moving memory to disk.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/swappiness/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eSwappiness\u003c/strong\u003e (\u003ccode\u003evm.swappiness\u003c/code\u003e) is a Linux kernel parameter that controls how aggressively the system moves memory pages from RAM to swap on disk. The value ranges from 0 (swap only in extreme cases) to 100 (aggressive swapping). The default is 60.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWith the default value of 60, Linux starts swapping when memory pressure is still relatively low. For a dedicated database server, this is unacceptable: the SGA must stay in RAM, always. 
The recommended value for Oracle is 1 — not 0, which would completely disable swap and could trigger the OOM killer.\u003c/p\u003e","title":"Swappiness"},{"content":"A switchover is a planned Oracle Data Guard operation that reverses the roles between the primary and standby databases. The primary becomes the standby, the standby becomes the primary. No data is lost, no transaction fails — it\u0026rsquo;s a clean, controlled transition.\nSwitchover vs Failover #The distinction is fundamental:\nSwitchover Failover When Planned (maintenance, migration) Emergency (primary failure) Data loss Zero Possible (depends on mode) Reversibility Yes, with another switchover No, standby becomes primary permanently Time Minutes (typically 1-3) Seconds to minutes How to execute #With Data Guard Broker, the switchover is a single command:\nDGMGRL\u0026gt; SWITCHOVER TO standby_db; The broker automatically manages the sequence: stopping redo transport, applying the last redo on the standby, reversing roles, restarting redo transport in the opposite direction.\nUse in migrations #Switchover is the preferred strategy for Oracle cross-site migrations. You configure Data Guard between the source and target environments, let it synchronize, and at cutover time you execute the switchover. If something goes wrong on the new infrastructure, a second switchover brings everything back to the starting point — a safety net that Data Pump cannot offer.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/switchover/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eswitchover\u003c/strong\u003e is a planned Oracle Data Guard operation that reverses the roles between the primary and standby databases. The primary becomes the standby, the standby becomes the primary. 
No data is lost, no transaction fails — it\u0026rsquo;s a clean, controlled transition.\u003c/p\u003e\n\u003ch2 id=\"switchover-vs-failover\" class=\"relative group\"\u003eSwitchover vs Failover \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#switchover-vs-failover\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eThe distinction is fundamental:\u003c/p\u003e\n\u003ctable\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003e\u003c/th\u003e\n          \u003cth\u003eSwitchover\u003c/th\u003e\n          \u003cth\u003eFailover\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eWhen\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003ePlanned (maintenance, migration)\u003c/td\u003e\n          \u003ctd\u003eEmergency (primary failure)\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eData loss\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003eZero\u003c/td\u003e\n          \u003ctd\u003ePossible (depends on mode)\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eReversibility\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003eYes, with another switchover\u003c/td\u003e\n          \u003ctd\u003eNo, standby becomes primary permanently\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003cstrong\u003eTime\u003c/strong\u003e\u003c/td\u003e\n          \u003ctd\u003eMinutes (typically 1-3)\u003c/td\u003e\n          \u003ctd\u003eSeconds to minutes\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2 id=\"how-to-execute\" 
class=\"relative group\"\u003eHow to execute \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-to-execute\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWith Data Guard Broker, the switchover is a single command:\u003c/p\u003e","title":"Switchover"},{"content":"A System Privilege in Oracle is an authorization that allows performing global database operations, independent of any specific object. Typical examples include CREATE TABLE, CREATE SESSION, ALTER SYSTEM, CREATE USER, and DROP ANY TABLE.\nHow it works #System privileges are granted with GRANT and revoked with REVOKE. They can be assigned directly to a user or to a role. The predefined DBA role includes over 200 system privileges, which is why granting it to application users is a dangerous practice.\nWhat it\u0026rsquo;s for #System privileges define what a user can do at the database level: create objects, manage users, modify system parameters. They are the highest level of authorization in Oracle and must be managed with extreme care, following the principle of least privilege.\nWhat can go wrong #A system privilege like DROP ANY TABLE allows deleting any table in any schema. If mistakenly granted to an application user, a single command can destroy production data. The distinction between system privileges and object privileges is fundamental to building a robust security model.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/system-privilege/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eSystem Privilege\u003c/strong\u003e in Oracle is an authorization that allows performing global database operations, independent of any specific object. 
Typical examples include \u003ccode\u003eCREATE TABLE\u003c/code\u003e, \u003ccode\u003eCREATE SESSION\u003c/code\u003e, \u003ccode\u003eALTER SYSTEM\u003c/code\u003e, \u003ccode\u003eCREATE USER\u003c/code\u003e, and \u003ccode\u003eDROP ANY TABLE\u003c/code\u003e.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eSystem privileges are granted with \u003ccode\u003eGRANT\u003c/code\u003e and revoked with \u003ccode\u003eREVOKE\u003c/code\u003e. They can be assigned directly to a user or to a role. The predefined \u003ccode\u003eDBA\u003c/code\u003e role includes over 200 system privileges, which is why granting it to application users is a dangerous practice.\u003c/p\u003e","title":"System Privilege"},{"content":"systemd is the default init system and service manager on modern Linux distributions (CentOS/RHEL 7+, Ubuntu 16.04+, Debian 8+). In the database context, it is the mechanism that starts, stops and monitors MySQL or MariaDB instances.\nHow it works #Each service is defined by a unit file (e.g. mysqld.service) that specifies the startup command, configuration file, dependencies and crash behaviour. In a multi-instance setup, separate unit files are created for each instance (e.g. mysqld-app2.service, mysqld-reporting.service), each with its own --defaults-file pointing to a different my.cnf.\nWhat it\u0026rsquo;s for #systemd allows managing MySQL instances as independent services: starting, stopping, restarting and monitoring them separately. 
The systemctl cat \u0026lt;service\u0026gt; command is essential for tracing from the service name back to the instance\u0026rsquo;s configuration file, and from there to port, socket and datadir.\nWhen to use it #systemd is automatically active on any modern Linux server. In DBA work, you interact with it via systemctl start/stop/status/restart \u0026lt;service\u0026gt;. In multi-instance environments, systemctl list-units --type=service | grep mysql is the first command for identifying how many instances are running on a server.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/systemd/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003esystemd\u003c/strong\u003e is the default init system and service manager on modern Linux distributions (CentOS/RHEL 7+, Ubuntu 16.04+, Debian 8+). In the database context, it is the mechanism that starts, stops and monitors MySQL or MariaDB instances.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEach service is defined by a unit file (e.g. \u003ccode\u003emysqld.service\u003c/code\u003e) that specifies the startup command, configuration file, dependencies and crash behaviour. In a multi-instance setup, separate unit files are created for each instance (e.g. 
\u003ccode\u003emysqld-app2.service\u003c/code\u003e, \u003ccode\u003emysqld-reporting.service\u003c/code\u003e), each with its own \u003ccode\u003e--defaults-file\u003c/code\u003e pointing to a different \u003ccode\u003emy.cnf\u003c/code\u003e.\u003c/p\u003e","title":"systemd"},{"content":"A Tablespace is the logical unit of storage organisation in Oracle Database. Each tablespace comprises one or more physical datafiles on disk, and every database object (table, index, partition) resides in a tablespace.\nHow it works #Oracle separates logical management (tablespace) from physical management (datafile). A DBA can create dedicated tablespaces for different purposes: one for active data, one for indexes, one for archive. This enables distributing I/O load across different disks and applying differentiated management policies (e.g. read-only for historical data).\nWhat it\u0026rsquo;s for #In the partitioning context, tablespaces enable advanced lifecycle management strategies: moving old partitions to economical archive tablespaces, setting them to read-only to reduce backup load, and reclaiming active space without deleting data. An ALTER TABLE MOVE PARTITION ... TABLESPACE ts_archive is a DDL operation that takes less than a second.\nWhen to use it #Every Oracle installation uses tablespaces. Tablespace design becomes critical when managing tables of hundreds of GB with partitioning, because good distribution across separate tablespaces enables efficient incremental backups and data lifecycle management.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/tablespace/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eTablespace\u003c/strong\u003e is the logical unit of storage organisation in Oracle Database. 
Each tablespace comprises one or more physical datafiles on disk, and every database object (table, index, partition) resides in a tablespace.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eOracle separates logical management (tablespace) from physical management (datafile). A DBA can create dedicated tablespaces for different purposes: one for active data, one for indexes, one for archive. This enables distributing I/O load across different disks and applying differentiated management policies (e.g. read-only for historical data).\u003c/p\u003e","title":"Tablespace"},{"content":"THP (Transparent Huge Pages) is a Linux kernel feature that automatically promotes 4 KB memory pages to 2 MB in the background, without explicit configuration. Unlike static Huge Pages, they are managed by the khugepaged kernel process.\nHow it works #When active (default always), the kernel attempts to compact normal pages into huge pages in the background. The khugepaged process works continuously to find and merge groups of contiguous pages, causing unpredictable micro-freezes during compaction operations.\nWhy it matters #For Oracle they are a disaster. Oracle states explicitly in its documentation: disable THP. The \u0026ldquo;freezes of a few seconds\u0026rdquo; that users complain about are often caused by khugepaged. 
They are disabled with echo never \u0026gt; /sys/kernel/mm/transparent_hugepage/enabled and via GRUB for persistence across reboots.\nWhat can go wrong #The confusion between Huge Pages (good for Oracle, statically configured) and THP (harmful for Oracle, active by default) is one of the most common mistakes. A DBA who sees \u0026ldquo;Huge Pages\u0026rdquo; in the documentation and doesn\u0026rsquo;t disable THP is making things worse instead of better.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/thp/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eTHP\u003c/strong\u003e (Transparent Huge Pages) is a Linux kernel feature that automatically promotes 4 KB memory pages to 2 MB in the background, without explicit configuration. Unlike static Huge Pages, they are managed by the \u003ccode\u003ekhugepaged\u003c/code\u003e kernel process.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen active (default \u003ccode\u003ealways\u003c/code\u003e), the kernel attempts to compact normal pages into huge pages in the background. The \u003ccode\u003ekhugepaged\u003c/code\u003e process works continuously to find and merge groups of contiguous pages, causing unpredictable micro-freezes during compaction operations.\u003c/p\u003e","title":"THP"},{"content":"Timeboxing is a time management technique that assigns a fixed, non-negotiable time interval to an activity. 
When time runs out, the activity ends — regardless of whether it has been completed or not.\nHow it works #A maximum duration is defined (15 minutes for a standup, 1 hour for a design meeting, 2 weeks for a sprint) and the constraint is respected. The timebox forces people to focus on the essential, avoiding endless discussions and paralyzing perfectionism.\nWhat it\u0026rsquo;s for #In project management, timeboxing underpins the standup meeting (15 minutes maximum), the Scrum sprint (fixed duration), and any well-managed meeting. Without a time constraint, meetings expand to fill all available time — Parkinson\u0026rsquo;s law applied to meeting rooms.\nWhy it matters #A 15-minute standup costs the team 440 hours per year. A 45-minute one costs 1,320. The difference — 880 hours — equals nearly 5 person-months. Timeboxing is not rigidity: it is respect for people\u0026rsquo;s time.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/timeboxing/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eTimeboxing\u003c/strong\u003e is a time management technique that assigns a fixed, non-negotiable time interval to an activity. When time runs out, the activity ends — regardless of whether it has been completed or not.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eA maximum duration is defined (15 minutes for a standup, 1 hour for a design meeting, 2 weeks for a sprint) and the constraint is respected. 
The timebox forces people to focus on the essential, avoiding endless discussions and paralyzing perfectionism.\u003c/p\u003e","title":"Timeboxing"},{"content":"Transport lag is the delay between when the primary database generates a redo log and when that redo log is received by the standby database in an Oracle Data Guard configuration. It\u0026rsquo;s one of the most important indicators for assessing replication health.\nHow it\u0026rsquo;s measured #Transport lag is monitored via a query on the V$DATAGUARD_STATS view or through Data Guard Broker:\nDGMGRL\u0026gt; SHOW DATABASE 'standby_db' 'TransportLagTarget'; The value is expressed in time format (e.g., +00 00:00:03 = 3 seconds of delay). A transport lag of a few seconds is normal in Maximum Performance mode; a lag that consistently grows indicates a bandwidth or redo generation rate problem.\nDifference from Apply Lag # Metric What it measures Transport Lag Delay in transmitting redo from primary to standby Apply Lag Delay in applying redo on the standby after reception Transport lag depends on the network (bandwidth, latency); apply lag depends on standby resources (CPU, I/O). In cross-site migrations, transport lag is the most common bottleneck.\nImpact in migrations #During a cross-site Data Guard migration, transport lag must be carefully monitored during peak load phases (nightly batches, activity spikes). A redo generation rate that exceeds VPN capacity produces a growing transport lag. Before cutover, the transport lag must be close to zero to ensure the switchover happens without data loss.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/transport-lag/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eTransport lag\u003c/strong\u003e is the delay between when the primary database generates a redo log and when that redo log is received by the standby database in an Oracle Data Guard configuration. 
It\u0026rsquo;s one of the most important indicators for assessing replication health.\u003c/p\u003e\n\u003ch2 id=\"how-its-measured\" class=\"relative group\"\u003eHow it\u0026rsquo;s measured \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-its-measured\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eTransport lag is monitored via a query on the \u003ccode\u003eV$DATAGUARD_STATS\u003c/code\u003e view or through Data Guard Broker:\u003c/p\u003e","title":"Transport Lag"},{"content":"Unified Audit (Oracle Unified Auditing) is the centralized auditing system introduced in Oracle Database 12c that replaces legacy audit mechanisms with a single unified infrastructure. All audit events converge in the UNIFIED_AUDIT_TRAIL view.\nHow it works #Unified Audit is based on audit policies: declarative rules that specify which actions to monitor (DDL, DML, logins, administrative operations). Policies are created with CREATE AUDIT POLICY, enabled with ALTER AUDIT POLICY ... ENABLE, and can be applied to specific users or globally. Audit records are written to an internal queue and then persisted in the system table.\nWhat it\u0026rsquo;s for #It answers the fundamental security question: \u0026ldquo;who did what, when, and from where?\u0026rdquo; It enables tracking of critical operations such as DROP TABLE, GRANT, REVOKE, access to sensitive data, and failed login attempts. It is essential for compliance (GDPR, SOX, PCI-DSS) and post-incident investigations.\nWhy it matters #Oracle\u0026rsquo;s legacy traditional audit fragmented logs across OS files, the SYS.AUD$ table, and FGA_LOG$, making analysis complex. Unified Audit centralizes everything into a single point with better performance and simplified management. 
In an environment with no audit configured, a security incident becomes impossible to reconstruct.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/unified-audit/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eUnified Audit\u003c/strong\u003e (Oracle Unified Auditing) is the centralized auditing system introduced in Oracle Database 12c that replaces legacy audit mechanisms with a single unified infrastructure. All audit events converge in the \u003ccode\u003eUNIFIED_AUDIT_TRAIL\u003c/code\u003e view.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eUnified Audit is based on \u003cstrong\u003eaudit policies\u003c/strong\u003e: declarative rules that specify which actions to monitor (DDL, DML, logins, administrative operations). Policies are created with \u003ccode\u003eCREATE AUDIT POLICY\u003c/code\u003e, enabled with \u003ccode\u003eALTER AUDIT POLICY ... ENABLE\u003c/code\u003e, and can be applied to specific users or globally. Audit records are written to an internal queue and then persisted in the system table.\u003c/p\u003e","title":"Unified Audit"},{"content":"A Unix Socket (or Unix domain socket) is a communication endpoint that allows two processes on the same operating system to exchange data without going through the TCP/IP network stack. In MySQL, it is the default connection method when connecting to localhost.\nHow it works #When a MySQL client connects specifying -h localhost, the client does not use TCP. 
It uses the Unix socket file (typically /var/run/mysqld/mysqld.sock) to communicate directly with the MySQL server process. This communication happens entirely within the kernel, with no network overhead, and is faster than a TCP connection even on the same host.\nWhat it\u0026rsquo;s for #In multi-instance environments, each MySQL instance has its own socket file (e.g. mysqld.sock, mysqld-app2.sock). Specifying the correct socket with --socket=/path/to/socket is the only reliable way to connect to the intended instance. Without specifying the socket, the client uses the default one — which almost always points to the primary instance.\nWhen to use it #Unix sockets are used for all local connections to MySQL. In environments with multiple instances, it is essential to explicitly specify the socket for each connection. For remote connections (from another host), TCP with -h \u0026lt;ip\u0026gt; -P \u0026lt;port\u0026gt; is used instead.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/unix-socket/","section":"Glossary","summary":"\u003cp\u003eA \u003cstrong\u003eUnix Socket\u003c/strong\u003e (or Unix domain socket) is a communication endpoint that allows two processes on the same operating system to exchange data without going through the TCP/IP network stack. In MySQL, it is the default connection method when connecting to \u003ccode\u003elocalhost\u003c/code\u003e.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a MySQL client connects specifying \u003ccode\u003e-h localhost\u003c/code\u003e, the client does not use TCP. 
It uses the Unix socket file (typically \u003ccode\u003e/var/run/mysqld/mysqld.sock\u003c/code\u003e) to communicate directly with the MySQL server process. This communication happens entirely within the kernel, with no network overhead, and is faster than a TCP connection even on the same host.\u003c/p\u003e","title":"Unix Socket"},{"content":"VACUUM is the PostgreSQL command that reclaims space occupied by dead tuples and makes it available for new inserts. It does not return space to the operating system, does not reorganize the table, and does not compact anything — it marks pages as rewritable.\nHow it works #VACUUM table scans the table, identifies dead tuples no longer visible to any transaction, and marks their space as reusable. It is a lightweight operation that does not block writes and can run in parallel with normal queries. VACUUM FULL instead physically rewrites the entire table with an exclusive lock — to be used very rarely and only in emergencies.\nWhat it\u0026rsquo;s for #Without VACUUM, tables with heavy UPDATE and DELETE traffic accumulate dead tuples that occupy disk space and slow down sequential scans. VACUUM is the essential cleanup mechanism that balances the cost of PostgreSQL\u0026rsquo;s MVCC model.\nWhy it matters #Autovacuum runs VACUUM automatically, but with PostgreSQL defaults it may trigger too infrequently on high-traffic tables. On a table with 10 million rows, the default waits for 2 million dead tuples before acting — enough to visibly degrade performance.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/vacuum/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eVACUUM\u003c/strong\u003e is the PostgreSQL command that reclaims space occupied by dead tuples and makes it available for new inserts. 
It does not return space to the operating system, does not reorganize the table, and does not compact anything — it marks pages as rewritable.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003e\u003ccode\u003eVACUUM table\u003c/code\u003e scans the table, identifies dead tuples no longer visible to any transaction, and marks their space as reusable. It is a lightweight operation that does not block writes and can run in parallel with normal queries. \u003ccode\u003eVACUUM FULL\u003c/code\u003e instead physically rewrites the entire table with an exclusive lock — to be used very rarely and only in emergencies.\u003c/p\u003e","title":"VACUUM"},{"content":"Vendor Lock-in is the situation where a company becomes dependent on an external supplier to the point that switching becomes extremely costly or technically complex. In the IT context, it occurs when the code, architecture or system knowledge are in the supplier\u0026rsquo;s hands, not the client\u0026rsquo;s.\nHow it works #Lock-in establishes itself gradually: the supplier writes code with their own conventions, uses proprietary or undocumented technologies, and the internal team is not involved in development. When the supplier leaves — by choice or dismissal — they take the know-how with them. The client is left with software they don\u0026rsquo;t understand, can\u0026rsquo;t maintain and can\u0026rsquo;t evolve without re-engaging the same supplier or starting from scratch.\nWhat it\u0026rsquo;s for #Understanding vendor lock-in is essential for making strategic decisions about outsourcing and software development. 
Every project should include mitigation measures: internal documentation, code reviews, internal team involvement, source code ownership.\nWhen to use it #The term describes a risk to avoid. The main countermeasures are: keeping critical know-how internally, preferring open and standard technologies, ensuring intellectual property ownership of the code, and always evaluating the \u0026ldquo;buy vs build\u0026rdquo; option before starting large-scale custom projects.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/vendor-lock-in/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eVendor Lock-in\u003c/strong\u003e is the situation where a company becomes dependent on an external supplier to the point that switching becomes extremely costly or technically complex. In the IT context, it occurs when the code, architecture or system knowledge are in the supplier\u0026rsquo;s hands, not the client\u0026rsquo;s.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eLock-in establishes itself gradually: the supplier writes code with their own conventions, uses proprietary or undocumented technologies, and the internal team is not involved in development. When the supplier leaves — by choice or dismissal — they take the know-how with them. 
The client is left with software they don\u0026rsquo;t understand, can\u0026rsquo;t maintain and can\u0026rsquo;t evolve without re-engaging the same supplier or starting from scratch.\u003c/p\u003e","title":"Vendor Lock-in"},{"content":"Version Control is a system that records every change to a project\u0026rsquo;s files, maintaining a complete history of who changed what, when and why. Git is the most widely used version control system in the world.\nHow it works #Every change is recorded as a \u0026ldquo;commit\u0026rdquo; with a descriptive message, an author and a timestamp. The system maintains the project\u0026rsquo;s entire history: you can return to any previous version, compare different versions and understand how the code evolved over time. With Git, every developer has a complete copy of the history on their own computer.\nWhat it\u0026rsquo;s for #Without version control, code lives on shared folders where accidental overwrites are the norm and nobody knows which is the \u0026ldquo;good\u0026rdquo; version. With version control, every change is tracked and reversible, conflicts between developers are managed in a structured way, and the project\u0026rsquo;s history is a resource, not a mystery.\nWhen to use it #Always, on any software project with more than one file or more than one developer. The absence of version control is the first sign of a project out of control. GitHub, GitLab and Bitbucket are platforms that add collaboration (Pull Requests, Issue tracker) on top of Git.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/version-control/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eVersion Control\u003c/strong\u003e is a system that records every change to a project\u0026rsquo;s files, maintaining a complete history of who changed what, when and why. 
Git is the most widely used version control system in the world.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eEvery change is recorded as a \u0026ldquo;commit\u0026rdquo; with a descriptive message, an author and a timestamp. The system maintains the project\u0026rsquo;s entire history: you can return to any previous version, compare different versions and understand how the code evolved over time. With Git, every developer has a complete copy of the history on their own computer.\u003c/p\u003e","title":"Version Control"},{"content":"Wait Event is an Oracle Database diagnostic indicator that identifies why a session is waiting rather than actively working. Whenever a process cannot proceed — because it is waiting for a block from disk, a lock, a network response or a CPU slot — Oracle records a specific wait event.\nThe most common # Wait Event Meaning db file sequential read Single-block read — typical of index access db file scattered read Multi-block read — typical of full table scans log file sync Waiting for commit to redo log enq: TX - row lock contention Row lock conflict direct path read Direct read (bypassing buffer cache) What they\u0026rsquo;re for #Wait events are the foundation of Oracle\u0026rsquo;s diagnostic methodology. 
By analysing which events dominate DB time (via AWR or ASH) you can immediately identify the nature of the problem: I/O, contention, CPU or network.\nWhere to find them # Real-time: V$SESSION_WAIT, V$ACTIVE_SESSION_HISTORY Historical: AWR reports (Top Timed Foreground Events section), DBA_HIST_ACTIVE_SESS_HISTORY The DBA\u0026rsquo;s rule: don\u0026rsquo;t guess what\u0026rsquo;s slowing the database down — look at the wait events.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/wait-event/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eWait Event\u003c/strong\u003e is an Oracle Database diagnostic indicator that identifies why a session is waiting rather than actively working. Whenever a process cannot proceed — because it is waiting for a block from disk, a lock, a network response or a CPU slot — Oracle records a specific wait event.\u003c/p\u003e\n\u003ch2 id=\"the-most-common\" class=\"relative group\"\u003eThe most common \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#the-most-common\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003ctable\u003e\n  \u003cthead\u003e\n      \u003ctr\u003e\n          \u003cth\u003eWait Event\u003c/th\u003e\n          \u003cth\u003eMeaning\u003c/th\u003e\n      \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003ccode\u003edb file sequential read\u003c/code\u003e\u003c/td\u003e\n          \u003ctd\u003eSingle-block read — typical of index access\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003ccode\u003edb file scattered read\u003c/code\u003e\u003c/td\u003e\n          \u003ctd\u003eMulti-block read — typical of full table scans\u003c/td\u003e\n      
\u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003ccode\u003elog file sync\u003c/code\u003e\u003c/td\u003e\n          \u003ctd\u003eWaiting for commit to redo log\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003ccode\u003eenq: TX - row lock contention\u003c/code\u003e\u003c/td\u003e\n          \u003ctd\u003eRow lock conflict\u003c/td\u003e\n      \u003c/tr\u003e\n      \u003ctr\u003e\n          \u003ctd\u003e\u003ccode\u003edirect path read\u003c/code\u003e\u003c/td\u003e\n          \u003ctd\u003eDirect read (bypassing buffer cache)\u003c/td\u003e\n      \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2 id=\"what-theyre-for\" class=\"relative group\"\u003eWhat they\u0026rsquo;re for \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#what-theyre-for\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWait events are the foundation of Oracle\u0026rsquo;s diagnostic methodology. By analysing which events dominate DB time (via AWR or ASH) you can immediately identify the nature of the problem: I/O, contention, CPU or network.\u003c/p\u003e","title":"Wait Event"},{"content":"WSREP (Write Set Replication) is the API and protocol that Galera Cluster uses for synchronous multi-master replication. Each transaction is captured as a \u0026ldquo;write set\u0026rdquo; (a set of row-level changes) and replicated to all cluster nodes before commit.\nHow it works #When a node executes a transaction, WSREP intercepts it at commit time, packages it as a write set and sends it to all cluster nodes via the group communication protocol. 
Each node performs a certification process: it verifies that the transaction doesn\u0026rsquo;t conflict with other concurrent transactions. If certification succeeds, all nodes apply the transaction. If it fails, the transaction is rolled back on the originating node.\nWhat it\u0026rsquo;s for #WSREP ensures all cluster nodes have the same data at all times (synchronous replication). Unlike traditional MySQL asynchronous replication, there is no lag between master and slave: when a transaction is committed on one node, it\u0026rsquo;s already present on all others.\nWhen to use it #WSREP is activated with the wsrep_on=ON parameter in MariaDB/Percona XtraDB Cluster configuration. Status variables starting with wsrep_ (such as wsrep_cluster_size, wsrep_cluster_status, wsrep_flow_control_paused) are the main indicators for monitoring cluster health.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/wsrep/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eWSREP\u003c/strong\u003e (Write Set Replication) is the API and protocol that Galera Cluster uses for synchronous multi-master replication. Each transaction is captured as a \u0026ldquo;write set\u0026rdquo; (a set of row-level changes) and replicated to all cluster nodes before commit.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen a node executes a transaction, WSREP intercepts it at commit time, packages it as a write set and sends it to all cluster nodes via the group communication protocol. 
Each node performs a \u003cstrong\u003ecertification\u003c/strong\u003e process: it verifies that the transaction doesn\u0026rsquo;t conflict with other concurrent transactions. If certification succeeds, all nodes apply the transaction. If it fails, the transaction is rolled back on the originating node.\u003c/p\u003e","title":"WSREP"},{"content":"Yes-And is a communication technique from improvisational theatre, applied to project management to transform confrontational discussions into constructive conversations. The principle is simple: instead of negating someone\u0026rsquo;s proposal with \u0026ldquo;No, but\u0026hellip;\u0026rdquo;, you acknowledge it with \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; while adding your own contribution.\nHow it works #When someone proposes an idea, a \u0026ldquo;No\u0026rdquo; response triggers a defensive reaction and blocks the conversation. A \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; response acknowledges the validity of the proposal and extends it, keeping the dialogue open. It does not mean agreeing with everything — it means building on the other person\u0026rsquo;s proposal before redirecting it.\nWhen to use it #In project meetings where two positions clash, in code reviews where feedback risks sounding like criticism, in stakeholder negotiations where a blunt \u0026ldquo;no\u0026rdquo; burns relationships. It is particularly effective when diverse opinions need to converge toward a shared decision.\nWhen it doesn\u0026rsquo;t work #It does not apply to security concerns, process violations, or non-negotiable deadlines. If someone proposes removing authentication from the production database, the answer is \u0026ldquo;No\u0026rdquo;, period. 
Yes-And works with people acting in good faith; it does not work with those who only want to be right.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/yes-and/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eYes-And\u003c/strong\u003e is a communication technique from improvisational theatre, applied to project management to transform confrontational discussions into constructive conversations. The principle is simple: instead of negating someone\u0026rsquo;s proposal with \u0026ldquo;No, but\u0026hellip;\u0026rdquo;, you acknowledge it with \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; while adding your own contribution.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eWhen someone proposes an idea, a \u0026ldquo;No\u0026rdquo; response triggers a defensive reaction and blocks the conversation. A \u0026ldquo;Yes, and\u0026hellip;\u0026rdquo; response acknowledges the validity of the proposal and extends it, keeping the dialogue open. It does not mean agreeing with everything — it means building on the other person\u0026rsquo;s proposal before redirecting it.\u003c/p\u003e","title":"Yes-And"},{"content":"ZDM (Zero Downtime Migration) is the tool Oracle provides for automating Oracle database migrations to OCI (Oracle Cloud Infrastructure) or to higher-version on-premises databases. The name is somewhat optimistic — downtime isn\u0026rsquo;t zero, but it\u0026rsquo;s minimized.\nHow it works #ZDM is essentially an orchestrator that combines existing Oracle technologies under a single automated workflow. 
It supports two modes:\nPhysical migration (Data Guard-based): creates a standby of the source database on the target, synchronizes it via redo transport, then performs a switchover. Downtime in the order of minutes. Logical migration (Data Pump-based): performs logical export and import with incremental synchronization via GoldenGate or Data Pump. More flexible but slower. When to use it #ZDM is suited for standard migrations where source and target infrastructure are configured conventionally. The advantage is automation: it reduces the chance of human error in repetitive steps.\nWhen not to use it #For complex configurations — RAC with cross-engine DB links, non-standard external dependencies, PL/SQL procedures with HTTP calls — ZDM\u0026rsquo;s automation layer can become an obstacle. In these cases, configuring Data Guard manually provides more control over details and the sequence of operations.\nRequirements #ZDM requires a dedicated host (the \u0026ldquo;ZDM service host\u0026rdquo;) with SSH access to both the source database and the target. The source must be Oracle 11.2.0.4 or higher, and the target can be on OCI or on-premises.\n","date":"1 January 0001","permalink":"https://ivanluminaria.com/en/glossary/zdm/","section":"Glossary","summary":"\u003cp\u003e\u003cstrong\u003eZDM\u003c/strong\u003e (Zero Downtime Migration) is the tool Oracle provides for automating Oracle database migrations to OCI (Oracle Cloud Infrastructure) or to higher-version on-premises databases. 
The name is somewhat optimistic — downtime isn\u0026rsquo;t zero, but it\u0026rsquo;s minimized.\u003c/p\u003e\n\u003ch2 id=\"how-it-works\" class=\"relative group\"\u003eHow it works \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#how-it-works\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eZDM is essentially an orchestrator that combines existing Oracle technologies under a single automated workflow. It supports two modes:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003ePhysical migration\u003c/strong\u003e (Data Guard-based): creates a standby of the source database on the target, synchronizes it via redo transport, then performs a switchover. Downtime in the order of minutes.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLogical migration\u003c/strong\u003e (Data Pump-based): performs logical export and import with incremental synchronization via GoldenGate or Data Pump. More flexible but slower.\u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2 id=\"when-to-use-it\" class=\"relative group\"\u003eWhen to use it \u003cspan class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100\"\u003e\u003ca class=\"group-hover:text-primary-300 dark:group-hover:text-neutral-700\" style=\"text-decoration-line: none !important;\" href=\"#when-to-use-it\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\u003c/span\u003e\u003c/h2\u003e\u003cp\u003eZDM is suited for standard migrations where source and target infrastructure are configured conventionally. The advantage is automation: it reduces the chance of human error in repetitive steps.\u003c/p\u003e","title":"ZDM"}]