I have a master database on a cloud server that holds the data for a number of schools. Its dashboard shows the details of every school and can edit their information and other data.
Each of those schools also has a local server deployed at its own location. That local server's dashboard can only edit the data of the one school it is deployed for.
What I want is to synchronize changes from the cloud server down to the corresponding school's local server, and likewise from the local server back up to the cloud.
Note: if you have ever tried Evernote, you can edit a note on whatever device you're using and it still synchronizes once you have internet, or when you click synchronize manually.
When the local server has no internet connection and some school data is edited, then once the connection is back up, the data on the local and cloud servers should be synchronized. That's the behaviour I'm trying to achieve.
Could anyone shed some light on where to start? I can't think of any solution that fits my problem.
I did consider using PHP to foreach-loop over every table for rows matching the current date and time, but I know that would perform terribly.
Edited: I deleted references / posts of other SO questions regarding this matter.
The application specs that I have found are:
- Local server computer: Windows 10 (deployed in the schools)
- Cloud server: probably some dedicated hosting that uses
Not to be picky, but hopefully the answers will assume you're talking to a newbie at master-to-slave database replication; I have no experience with it.
When we used to do this we would:
- Make sure every table we wanted to sync had datetime columns for Created, Modified and Deleted. They would also have a boolean isDeleted column (so rather than physically deleting records we would flag them as deleted and ignore them in queries). This meant we could query for any records deleted since a certain time and return an array of the deleted IDs.
- In each DB (master and slave), create a table that stores the datetime of the last successful sync. In the master this table stores multiple records, one per school, but in the slave it needs just one record: the last time it synced with the master.
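As a concrete sketch, the tracking columns and the last-sync table might look like this (SQLite is used purely for illustration; the table and column names are my own invention, not from the original setup):

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        id        INTEGER PRIMARY KEY,
        name      TEXT,
        created   TEXT NOT NULL,            -- ISO-8601 datetimes
        modified  TEXT NOT NULL,
        deleted   TEXT,
        isDeleted INTEGER NOT NULL DEFAULT 0 -- soft-delete flag
    );
    -- One row per school on the master; a single row on each slave.
    CREATE TABLE sync_state (
        school    TEXT PRIMARY KEY,
        last_sync TEXT NOT NULL
    );
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute("INSERT INTO student (name, created, modified) VALUES (?, ?, ?)",
             ("Alice", now, now))

# Soft delete: flag the row instead of removing it, so it can still be synced.
conn.execute("UPDATE student SET isDeleted = 1, deleted = ?, modified = ? WHERE name = ?",
             (now, now, "Alice"))

# IDs of records deleted since the last successful sync:
deleted_ids = [r[0] for r in conn.execute(
    "SELECT id FROM student WHERE isDeleted = 1 AND deleted > ?",
    ("1970-01-01",))]
print(deleted_ids)  # -> [1]
```

Because deletes are only flags, the row is still physically present and its ID can be reported to the other side of the sync.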
In your case, every so often each of the slaves would:
1. Call a webservice (a URL) on the master, let's say called 'helloMaster'. It passes in the school name (or some specific identifier), the last time it successfully synced with the master and authentication details (for security), and expects a response indicating whether the master has had any updates for the school since that datetime. Really the point here is just to get an acknowledgement that the master is available and listening (i.e. the internet is still up).
2. The slave would then call another webservice, let's say called 'sendUpdates'. It again passes in the school name, last successful sync and authentication details, plus three arrays of the records added, updated and deleted since the last sync. The master just acknowledges receipt. If receipt is acknowledged the slave moves to step 3; otherwise it retries step 1 after a pause of some duration. The master now has the updates from the slave. Note: it is up to the master to decide how to merge records if they conflict with its own pending updates for that slave.
3. The slave then calls a webservice, let's say called 'getUpdates'. It passes in the school name, last successful sync and authentication details, and the master returns three arrays of the records added, updated and deleted on its side, which the slave is expected to apply to its database.
4. Finally, once the slave has applied the records, it notifies the master of success or failure through another webservice, say 'updateStatus'. On success the master returns a new sync date for the slave to store (this will exactly match the date the master stores in its own table). On failure the error is logged on the master and we go back to step 1 after a pause.
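The four steps above can be sketched as a single pass of the slave's sync loop. The webservice names come from the steps themselves, but everything else here (the payload shapes, and the in-memory FakeMaster standing in for real HTTP calls) is assumed for illustration only:

```python
from dataclasses import dataclass, field

@dataclass
class FakeMaster:
    """Stand-in for the master's webservices (helloMaster, sendUpdates,
    getUpdates, updateStatus); in reality these would be HTTP calls."""
    updates_for_slave: dict = field(default_factory=lambda: {
        "added": [{"id": "M_1", "name": "new course"}],
        "updated": [], "deleted": []})
    received: object = None
    new_sync_date: str = "2016-01-02T00:00:00Z"

    def hello_master(self, school, last_sync, auth):
        return True  # master is reachable and listening

    def send_updates(self, school, last_sync, auth, added, updated, deleted):
        self.received = {"added": added, "updated": updated, "deleted": deleted}
        return True  # acknowledge receipt

    def get_updates(self, school, last_sync, auth):
        return self.updates_for_slave

    def update_status(self, school, auth, ok):
        return self.new_sync_date if ok else None


def sync_once(master, school, last_sync, auth, local_changes, apply_locally):
    # Step 1: check the master is up and listening.
    if not master.hello_master(school, last_sync, auth):
        return last_sync  # no connection; try again later
    # Step 2: push our changes; if not acknowledged, retry from step 1 later.
    if not master.send_updates(school, last_sync, auth, **local_changes):
        return last_sync
    # Step 3: pull the master's changes and apply them to the local database.
    apply_locally(master.get_updates(school, last_sync, auth))
    # Step 4: report success and store the new sync date the master hands back.
    new_date = master.update_status(school, auth, ok=True)
    return new_date or last_sync


applied = []
master = FakeMaster()
new_sync = sync_once(master, "school1", "2016-01-01T00:00:00Z", "secret",
                     {"added": [], "deleted": [],
                      "updated": [{"id": "S_9", "name": "edited"}]},
                     applied.append)
print(new_sync)  # -> 2016-01-02T00:00:00Z
```

In a real deployment each FakeMaster method would be an authenticated HTTP request, and sync_once would run on a timer with a pause-and-retry wrapper.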
I have left out some detail about error handling, getting the times accurate across all devices (there might be different time zones involved), and some other bits and pieces, but that's the gist of it.
I may make refinements after thinking on it more (or others might edit my post).
Hope that helps at least.
I will suggest you go with the trivial solution, which to me is:
- Create an SQLite or other database (MySQL, or your choice) on the local server
- Keep an always-running thread that pings (makes an API call to) your master database every 5 minutes (depending on how much delay is acceptable)
- That thread also tells you whether you are connected to the internet or not.
If connected to the internet:
a) Send the local changes with the request to the master server. This master server is an application server that applies the changes from the local school machines (received via the API call) to the master database, after whatever validation your application requires.
b) Receive the updated changes from the server in the API response. These changes are served after conflicts have been resolved (e.g. if data on the school server was updated earlier than the same data in the master database, which one you accept depends on your requirements).
If not connected to the internet, keep storing changes in the local database and reflect them in the application running at the school; once you are connected again, push those changes to the master server and pull whatever applicable changes the master has.
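A minimal sketch of that offline queue, where the connectivity check and the master API call are stand-ins assumed for illustration:

```python
from collections import deque

class LocalStore:
    """Local database stand-in: edits apply immediately for the school's
    application, and are also queued for the next push to the master."""
    def __init__(self):
        self.data = {}
        self.pending = deque()  # changes not yet pushed to the master

    def edit(self, key, value):
        self.data[key] = value             # reflect immediately in the local app
        self.pending.append((key, value))  # remember it for the next sync

    def flush(self, is_online, push_to_master):
        if not is_online():
            return 0  # still offline: keep queueing
        sent = 0
        while self.pending:
            key, value = self.pending.popleft()
            push_to_master(key, value)
            sent += 1
        return sent


store = LocalStore()
store.edit("school_name", "Springfield Elementary")  # edit made while offline
store.flush(lambda: False, lambda k, v: None)        # no internet: nothing sent

master = {}
sent = store.flush(lambda: True, master.__setitem__)  # internet is back
print(sent, master)  # -> 1 {'school_name': 'Springfield Elementary'}
```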
This is complicated to do on your own, but if the scale is small I would prefer implementing your own APIs for the database applications to connect in this manner.
A better solution would be Google Firebase, a realtime database that is updated asynchronously whenever there is a change on any machine, but it can cost more than you really need. It will, however, give you exactly the Evernote-style realtime editing features you want for your database systems.
This is not a problem that can be solved by database replication.
Generally speaking, database replication can operate in one of two modes:
Master/slave replication, which is what MySQL uses. In this mode, all writes must be routed to a single “master” server, and all of the replica databases receive a feed of changes from the master.
This doesn’t suit your needs, as writes can only be made to the master. (Modifying one of the replicas directly would result in it becoming permanently out of sync with the master.)
Quorum-based replication, which is used by some newer databases. All database replicas connect to each other. So long as at least half of all replicas are connected (that is, the cluster has reached “quorum”), writes can be made to any of the active databases, and will be propagated to all of the other databases. A database that is not connected will be brought up to date when it joins the quorum.
This doesn’t suit your needs either, as a disconnected replica cannot be written to. Worse, having more than half of all replicas disconnect from the master would prevent the remaining databases from being written to either!
What you need is some sort of data synchronization solution. Any solution will require some logic — which you will have to write! — to resolve conflicts. (For instance, if a record is modified in the master database while a school’s local replica is disconnected, and the same record is also modified there, you will need some way to reconcile those differences.)
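For instance, one of the simplest reconciliation rules you could write is "last write wins" on a Modified timestamp. This is a sketch of the kind of logic meant, not a recommendation; real rules are often per-field or per-table:

```python
from datetime import datetime

def resolve(master_row, local_row):
    """Return the winning version of a record by newest `modified` stamp
    (ties go to the master)."""
    m = datetime.fromisoformat(master_row["modified"])
    l = datetime.fromisoformat(local_row["modified"])
    return master_row if m >= l else local_row


# The same record was edited on the master and on a disconnected replica:
master_row = {"id": 7, "name": "Ms. Krabappel",  "modified": "2016-03-01T10:00:00"}
local_row  = {"id": 7, "name": "Mrs. Krabappel", "modified": "2016-03-01T12:30:00"}

winner = resolve(master_row, local_row)
print(winner["name"])  # -> Mrs. Krabappel (the later edit wins)
```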
No need for any complicated setup or APIs. MySQL allows you to easily replicate your database, and will ensure the replication is done correctly and in a timely manner whenever the internet is available (and it's fast, too). There are two modes:
- Master–slave: the master edits, the slave reads; in other words, one-way synchronization from master to slave.
- Master–master: master1 edits, master2 reads and edits; in other words, two-way synchronization. Both servers push and pull updates.
Assuming your cloud server has a schema for each school, and each schema is accessible with its own username and password (i.e. db_school1, db_school2),
you then have the option to replicate only a selected database schema from your cloud to the local master. In your case, school one's local master would only replicate db_school1.
If you want to replicate only a specific table, MySQL also has an option for that: replicate-do-table.
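For example, a school's local replica could filter replication in its my.cnf like this (the schema and table names are just the hypothetical ones from above):

```ini
[mysqld]
server-id       = 2                  # must be unique across all servers
relay-log       = mysql-relay-bin
replicate-do-db = db_school1         # only replicate this school's schema

# Or, to filter down to a specific table instead:
# replicate-do-table = db_school1.students
```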
The actual replication process is very easy to set up, but it can get very deep when you have different scenarios.
A few things you want to take note of: server IDs; different auto-increment values on each server to avoid conflicts with new records (i.e. master1 generates records on odd numbers, master2 on even numbers, so there won't be duplicate primary key issues); server-down alerts/monitoring; error skipping.
I'm not sure if you are on Linux or Windows; I've written a simple C# application which checks whether any of the masters is not replicating or has stopped for any reason, and sends an email. Monitoring is crucial!
Here are some links for master–master replication:
Also worth reading this optimised table-level replication info:
hope this helps.
The original version of this answer proposed MongoDB, but with further reading MongoDB is not so reliable over dodgy internet connections. CouchDB is designed for offline documents, which is what you need, although it's harder to get going than MongoDB, unfortunately.
I'd suggest not using MySQL, but deploying a document store designed for replication, such as CouchDB, unless you go for the commercial MySQL clustering services.
Being a lover of the power of MySQL I find it hard to suggest you use something else, but in this case, you really should.
Here is why –
Problems using MySQL replication
While MySQL has good replication (and that's most likely what you should be using if you're synchronizing a MySQL database, as recommended by others), there are some things to watch out for.
- “Unique key” clashes will give you a massive headache; the most likely cause of these is the auto-incrementing IDs that are common in MySQL applications (don't use them for sync operations unless there is a clear “read+write” -> “read-only” relationship, which there isn't in your case).
- Primary keys must be generated by each server but be unique across all servers, possibly by combining a server identifier with an ID unique to that server (Server1_1, Server1_2, Server1_3, etc. will not clash with Server2_1).
- MySQL sync only supports one-way replication unless you look at their clustering solutions (https://www.mysql.com/products/cluster/).
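The Server1_1/Server2_1 scheme above can be sketched in a few lines; the naming is taken straight from the bullet, while the generator itself is just an illustration:

```python
import itertools

def key_generator(server_name):
    """Return a function producing keys that are unique to this server:
    a server identifier plus a locally incrementing counter."""
    counter = itertools.count(1)
    return lambda: f"{server_name}_{next(counter)}"

server1_key = key_generator("Server1")
server2_key = key_generator("Server2")

# Keys generated independently on two servers can never clash:
keys = [server1_key(), server1_key(), server2_key()]
print(keys)  # -> ['Server1_1', 'Server1_2', 'Server2_1']
```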
Problems doing it “manually” by time-stamping the records.
Another answer recommends keeping “time updated” records. While I've used this approach, there are some big gotchas to be careful of.
- “Unique key” clashes (as mentioned above, the same problems: don't use auto-increment IDs for syncing, and generate primary keys unique to each server).
- Multiple updates on multiple servers need precisely synchronized clocks, and clashes must be handled according to rules. This can be a headache.
- What happens when updates are received way out of order? Which fields were updated and which weren't? You probably don't need to update the whole record, but how do you know?
- If you must, try one of the commercial solutions as mentioned in answers https://serverfault.com/questions/58829/how-to-keep-multiple-read-write-db-servers-in-sync and https://community.spiceworks.com/topic/352125-how-to-synchronize-multiple-mysql-servers and Strategy on synchronizing database from multiple locations to a central database and vice versa (etc – Google for more)
Problems doing it “manually” with journalling.
Journalling is keeping a separate record of what has changed and when. “Database X, Table Y, Field Z was updated to value A at time B” or “Table A had new record added with these details […]”. This allows you much finer control of what to update.
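To make that concrete, here is a toy journal and replay routine; the entry format is invented purely for illustration:

```python
# Each change is one journal entry: "table X, record Y, fields Z at time T".
journal = [
    {"time": 1, "table": "school", "id": 7, "op": "insert",
     "fields": {"name": "Springfield Elementary", "city": "Springfield"}},
    {"time": 2, "table": "school", "id": 7, "op": "update",
     "fields": {"city": "Shelbyville"}},
]

def replay(journal, db):
    """Apply journal entries oldest-first so later updates win."""
    for entry in sorted(journal, key=lambda e: e["time"]):
        key = (entry["table"], entry["id"])
        if entry["op"] == "insert":
            db[key] = dict(entry["fields"])
        elif entry["op"] == "update":
            db[key].update(entry["fields"])  # only the changed fields
        elif entry["op"] == "delete":
            del db[key]
    return db

db = replay(journal, {})
print(db[("school", 7)]["city"])  # -> Shelbyville
```

Because only changed fields are journalled, a receiving server can apply fine-grained updates instead of overwriting whole records.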
- If you look at database sync techniques, this is actually what is going on in the background; in MySQL's case it keeps a binary log of the updates.
- You only ever share the journal, never the original record.
- When another server receives a journal entry, it has a much better picture of what has happened before/after and can replay updates to ensure it gets the correct details.
- Problems arise when the journal and the database get out of sync (MySQL is actually a pain when this happens!). You need to have a “refresh” script ready to roll, sitting outside the journalling, that will sync the DB to the master.
- It’s complicated. So…
Solution: use a document store designed for replication, e.g. CouchDB
Bearing all this in mind, why not use a document store that already does all of that for you? CouchDB has support for this and handles all the journalling and syncing (http://docs.couchdb.org/en/master/replication/protocol.html).
There are others out there, but I believe you’ll end up with less headaches and errors than with the other solutions.
Master-to-master replication in MySQL can be accomplished without key violations while using auto_increment. Here is a link that explains how.
If you have tables without primary keys I'm not sure what will happen (I always include auto_increment primary keys on my tables).
The auto-increment-offset and auto-increment-increment settings affect the auto_increment values, as shown in the config samples from the article:
Server 1:
server_id = 1
log_bin = /var/log/mysql/mysql-bin.log
log_bin_index = /var/log/mysql/mysql-bin.log.index
relay_log = /var/log/mysql/mysql-relay-bin
relay_log_index = /var/log/mysql/mysql-relay-bin.index
expire_logs_days = 10
max_binlog_size = 100M
log_slave_updates = 1
auto-increment-increment = 2
auto-increment-offset = 1

Server 2:
server_id = 2
log_bin = /var/log/mysql/mysql-bin.log
log_bin_index = /var/log/mysql/mysql-bin.log.index
relay_log = /var/log/mysql/mysql-relay-bin
relay_log_index = /var/log/mysql/mysql-relay-bin.index
expire_logs_days = 10
max_binlog_size = 100M
log_slave_updates = 1
auto-increment-increment = 2
auto-increment-offset = 2