A thought on forum clustering
Posted by Darvil, 04-26-2011, 02:56 PM |
This was just an idea I had and probably not viable but I thought I'd post here to see what the pros think.
Let's say I have 2 or more servers and I want better uptime for the forum and to spread the load. Let's say the forum has around 300 simultaneous users, but most of the mysql traffic is browsing and not writes to the database. Let's say I have a way to spread the load between the 2 servers.
I know master to master mysql replication isn't viable if the servers are quite far apart. What about using master to slave replication? Suppose the forum software was modified (I have no idea how complicated this is) so that writes are done remotely to the master server and all the reads are done locally. In the case where the master goes down, the people who are on the slave server (and those who get redirected to the slave server) won't be able to post anymore, but they would still be able to browse the forum (I think).
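A rough sketch of that idea in Python, just to show the shape of "reads local, writes remote". Everything here is hypothetical: `SplitDatabase`, `MasterUnavailable` and the three callables are made-up stand-ins for real database connections and a health check, not anything from an actual forum package.

```python
class MasterUnavailable(Exception):
    """Raised when a write is attempted while the master cannot be reached."""


class SplitDatabase:
    """Send reads to the local slave and writes to the remote master.

    local_read, remote_write and master_is_up are hypothetical callables
    standing in for real connections and a real health check.
    """

    def __init__(self, local_read, remote_write, master_is_up):
        self.local_read = local_read
        self.remote_write = remote_write
        self.master_is_up = master_is_up

    def read(self, query):
        # Browsing never touches the master, so it survives a master outage.
        return self.local_read(query)

    def write(self, query):
        # Posts, registrations and profile edits all need the master.
        if not self.master_is_up():
            raise MasterUnavailable("forum is read-only until the master returns")
        return self.remote_write(query)


# Usage with stand-in callables instead of real connections.
master_state = {"up": True}
db = SplitDatabase(
    local_read=lambda q: "rows for: " + q,
    remote_write=lambda q: "written: " + q,
    master_is_up=lambda: master_state["up"],
)
db.write("INSERT INTO post ...")   # goes to the master
master_state["up"] = False         # master drops off the network
db.read("SELECT * FROM thread")    # browsing still works from the local slave
```

With the master marked as down, `db.write(...)` raises `MasterUnavailable`, which the forum could turn into a "posting is temporarily disabled" message instead of a raw database error.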
To me this seems like a viable way. What do you guys think about this? I know it would probably be a bad idea if the forum is massive, but what about a relatively small one?
|
Posted by relichost, 04-26-2011, 03:00 PM |
Hi
It's a good idea, but like you say you need to send writes to one server and reads to both.
I believe vBulletin has this built in as a feature.
Thanks
|
Posted by intelliServe, 04-26-2011, 03:01 PM |
I think this would be very difficult to do.
You'd have to write a lot of code between master and slave.
It does seem possible though; even better would be all servers in the same rack.
|
Posted by Darvil, 04-26-2011, 03:18 PM |
abtme,
vBulletin has this feature?? Really.. that's quite incredible. I guess I will have to look it up but I feel like it wouldn't be built in.
intelliServe,
Why would you have to write a lot of code between master and slave? You just have to send all the writes on the slave to the master db; things like registrations, forum posts and profile updates. After that's posted to the master, the master will replicate the data to the slave, so that should work ok in theory.
|
Posted by intelliServe, 04-26-2011, 03:28 PM |
It would need additional coding because you seem to want to use it as a secondary, with one server still serving reads if the write db is down, or vice-versa.
This would mean you'd have to split the read and write functions, which isn't done by default. However, if vB includes this then it doesn't matter.
|
Posted by relichost, 04-26-2011, 03:31 PM |
Well I'm fairly sure it does... I am now doubting myself.
In the config.php it mentions slave... I'm 99% sure of it..
|
Posted by eth00, 04-26-2011, 03:34 PM |
Yeah it does and has for a while. Very handy feature.
You can also use memcached to help with things like sessions if you want.
You can also look at caching some of the content.
|
Posted by Darvil, 04-26-2011, 03:34 PM |
abtme,
I think you might be right.
http://www.vbulletin.com/docs/html/config.php
I don't know exactly how that works, but from looking at the config it looks like you can use it that way.
If vbulletin does it, wouldn't invision power board do the same? Anyone know?
|
Posted by Darvil, 04-26-2011, 03:42 PM |
I literally just found this
http://community.invisionpower.com/f...-mysql-driver/
I guess you can do a similar thing with IPB using this mod.. But $75 is spendy.
|
Posted by eth00, 04-26-2011, 03:46 PM |
If $75 is a lot do you really need clustering or perhaps some optimization would do the trick for now? Generally when we start load balancing sites for performance they are already on some decently expensive hardware (but not always!)
|
Posted by Darvil, 04-26-2011, 03:49 PM |
eth00,
I don't need clustering done at all. My server is plenty powerful even if the forum grows 10 times bigger.
The thing is I have another server and probably one more in the future. I just want to do this as a fun project.
I was reading the info on the 75 dollar mod above, but it looks like it's not what I thought it was.
I don't think in a setup like that the forum can survive if the master server dies. It's just distributing the mysql load but not the whole site. That's actually not what I was thinking.
That makes me wonder if it's the same thing with the vBulletin setup.
Last edited by Darvil; 04-26-2011 at 03:55 PM.
|
Posted by quantumphysics, 04-26-2011, 03:53 PM |
Are you already using multiple memcached servers?
|
Posted by wartungsfenster, 04-26-2011, 04:13 PM |
*snicker*
I think generally people that can write well-distributed apps are not the ones writing forums :>
BUT!!! that doesn't make you wrong at all!
|
Posted by Darvil, 04-26-2011, 04:23 PM |
quantumphysics, not at all. As I've mentioned, I was thinking of this more as a widely distributed set of servers than a few local servers. I was just thinking about this to see if it's viable, that's all.
wartungsfenster,
Everyone's gotta do something. I suppose that means you agree it can be done?.. So where can I find one of these elite programmers who can write it up (but I probably can't afford them, eh?).
|
Posted by quad3datwork, 04-26-2011, 04:45 PM |
Those of you with vB4 (single server): how much memory does your memcached take? I've tweaked the memory setting and it still uses around 90MB, no matter if I set memcached to use 128MB or 256MB.
|
Posted by kmonchamp, 04-26-2011, 07:56 PM |
Splitting reads and writes to a mysql server is not all that hard. You set up some type of mysql proxy that sends commands that make writes to the master server and sends commands that read to the slave. As long as you had a session aware load balancer in front of the two web servers it would work fairly well. It isn't perfect but it would give a pretty good performance boost to something that is more read intensive. A really large site would use custom written software and sharding.
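To give a feel for the decision such a proxy makes on every statement, here is a small Python sketch. It is only an illustration: `is_read_query` and `choose_backend` are made-up names, the keyword check is a rough guess at what counts as a read, and none of this is the API of any real proxy.

```python
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def is_read_query(sql):
    """Return True if the statement looks like a plain read (keyword-based guess)."""
    stripped = sql.strip()
    if not stripped:
        return False
    first_word = stripped.split(None, 1)[0].upper()
    if first_word not in READ_VERBS:
        return False
    # SELECT ... FOR UPDATE takes locks, so treat it as a write.
    return "FOR UPDATE" not in stripped.upper()


def choose_backend(sql, in_transaction=False):
    """Pick 'slave' for plain reads and 'master' for everything else."""
    if in_transaction:
        # Keep a whole transaction on one server so it sees its own writes.
        return "master"
    return "slave" if is_read_query(sql) else "master"


for statement in (
    "SELECT * FROM thread WHERE forumid = 2",
    "INSERT INTO post (threadid, pagetext) VALUES (1, 'hi')",
    "UPDATE user SET lastvisit = 0 WHERE userid = 5",
):
    print(choose_backend(statement), "<-", statement)
```

Anything the check is unsure about goes to the master, which is the safe direction: a read sent to the master only costs some load, while a write sent to a slave breaks replication.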
Last edited by kmonchamp; 04-26-2011 at 07:57 PM.
Reason: spelling
|
Posted by Darvil, 04-26-2011, 08:19 PM |
Can you elaborate on "some type of mysql proxy"?
Remember though, I'm thinking of 2 different servers located in different datacenters. Basically using a clustering DNS solution (nginx, varnish?) to route to the different servers. I guess that would mean it's not session aware. I forgot about that. Perhaps store the sessions in the db?
The coding part will be an issue if every time you upgrade the forum software you have to redo everything. Plus you would need to hire someone who knows the forum software intimately.
I like the idea of putting something in front that can split the reads and writes so you won't really have to mess with the forum code. Try to keep the work minimal.
|
Posted by kmonchamp, 04-26-2011, 08:27 PM |
https://launchpad.net/mysql-proxy is one of the programs that comes to mind. Mysql replication can work across two different datacenters, although it can get out of sync. Semi-synchronous mysql replication is a recently added feature that makes geographic redundancy more viable and helps with out of sync issues.
|
Posted by Darvil, 04-26-2011, 10:50 PM |
kmonchamp,
Thanks for that link! I looked through it but I am not 100 percent sure. Let me see if I can guess how it would work with mysql-proxy.
Let's say I have 3 servers in 3 different datacenters: 1 master and 2 slaves. I won't need mysql-proxy in front of the master. I have mysql-proxy in front of the 2 slaves filtering the queries. Whenever there are write queries, mysql-proxy redirects all of those to the master server and lets everything else (i.e. selects) go through to the slaves.
Is that how it would work? What if the master isn't available? Can you script it to store the queries, or does it just return a typical error on the forum?
That would be quite incredible if it works that way, so you won't need to recode the forum script. You'd just have to install mysql-proxy and script it a little.
|
Posted by kmonchamp, 04-26-2011, 11:22 PM |
The basic idea is that the proxy gives you one IP address for the entire mysql system. The mysql proxy then handles the separation. You would use mysql replication to replicate data between servers.
If you wanted a high availability system (automatic failover) you would set up the servers with multi-master replication (both servers can be read and written to; you would use mysql proxy to write to only one, and if one is down it would just start using the other).
A note on mysql replication. Currently there are two types of mysql replication available, asynchronous and semi-synchronous. In asynchronous replication, data is written to a log and then sent to a slave. There is no guarantee that the slave receives it. In new versions of mysql, semi-synchronous replication was added to fix this problem: the master waits until a slave confirms it has received the data. It uses a bit more resources but is the best way to ensure data redundancy if you want to use automatic fail-over. If you have less bandwidth then use asynchronous, because it will very rarely lose data (really it will only lose data in the event the master server crashes right after it finishes processing a transaction). It's best to use HA or load balanced systems for workloads where nothing bad would happen if some data is missing or delayed (for example, a forum post takes a bit of extra time to show up on the other server, or is lost during fail-over). Again, semi-synchronous fixes most issues with data loss.
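The difference between the two modes can be shown with a toy model in Python. This is a simulation with made-up classes (`Master`, `Slave`), not real replication code, and it leaves out details of real mysql such as the timeout after which semi-synchronous replication falls back to asynchronous.

```python
class Slave:
    def __init__(self):
        self.relay_log = []

    def receive(self, event):
        self.relay_log.append(event)


class Master:
    def __init__(self, slave, semi_sync):
        self.slave = slave
        self.semi_sync = semi_sync
        self.binlog = []   # everything the client was told is saved
        self.unsent = []   # events still waiting to be shipped

    def commit(self, event):
        self.binlog.append(event)
        if self.semi_sync:
            # Semi-synchronous: hand the event to the slave before
            # telling the client the commit is done.
            self.slave.receive(event)
        else:
            # Asynchronous: tell the client "done" now, ship later.
            self.unsent.append(event)

    def ship_pending(self):
        while self.unsent:
            self.slave.receive(self.unsent.pop(0))


def acknowledged_but_missing(semi_sync):
    """Commit two posts, crash the master before the second background
    shipment, and return what the slave never received."""
    slave = Slave()
    master = Master(slave, semi_sync)
    master.commit("post 1")
    master.ship_pending()
    master.commit("post 2")
    # The master crashes here, so ship_pending() never runs again.
    return [e for e in master.binlog if e not in slave.relay_log]


print("async lost:", acknowledged_but_missing(semi_sync=False))
print("semi-sync lost:", acknowledged_but_missing(semi_sync=True))
```

In the asynchronous run the second post is acknowledged to the user but never reaches the slave, which is exactly the "master crashes right after a transaction" case described above. In the semi-synchronous run nothing is missing, at the cost of every commit waiting on the slave.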
You can see that this can get really complicated really fast, but it is a good way to improve availability and spread load without having to make any modifications to the software.
Mysql also has another project called MySQL Cluster that has fully synchronous replication and high redundancy. It keeps data in memory and then writes it out to disk, for improved speed. However it is really only practical for custom written software and has other limitations that would make it unsuitable for forums.
|
Posted by Darvil, 04-26-2011, 11:35 PM |
thanks kmonchamp,
But with master to master, you're talking about 2 local servers, right? It wouldn't be viable to do this with servers spread over large distances (i.e. one in the US, one in Europe and one in Asia). If you say the proxy gives 1 IP, then I'm assuming that all reads and writes go through it, which means the sites would be slow to browse. I was thinking more along the lines of: if you were in Europe, the DNS cluster would send you to the European server and you would browse the forum on that server. When you write, it goes to the master server instead of the slave.
I don't think the proxy will work well in this case, if what I am understanding is correct.
|
Posted by kmonchamp, 04-26-2011, 11:50 PM |
Yeah it's not really the best option for geographic redundancy. Actually mysql is not really all that good for geographic redundancy without custom written software; like I said, custom applications use techniques such as sharding. Only mysql goes through one IP (it might be possible to set it up in other ways, as I use multi master semi-sync); the real performance benefit would be to increase load on the web servers and spread out php load. It would really help if your load problems were mysql.
And it really all depends on how perfect the replication needs to be.
What larger sites might do is use multi master load balanced locally and then use asynchronous replication to tie different geographic areas together.
EDIT:
I was thinking a little more, and seeing as each forum would be running on a different server, you could run mysql proxy on all of them (even the master, in case mysql goes down on that machine). I think that would work. Each forum just points to its own server's proxy.
Last edited by kmonchamp; 04-27-2011 at 12:03 AM.
|
Posted by Darvil, 04-27-2011, 04:17 AM |
kmonchamp,
That's what I was thinking initially. All of them have their own mysql proxy but only the writes from the slaves would arrive at the master. It wouldn't be good if all the reads and writes had to go to the master server, as that would be pointless. The thing I found is that, at least on my end, most of the mysql traffic is just selects. I was thinking if there was a local server (i.e. in Europe) then browsing would be fast, which would cover like 90 percent of the use.
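A quick back-of-the-envelope version of that reasoning in Python. The 90 percent read share is the figure from the post; the 1 ms local and 150 ms cross-continent round trips are assumed numbers picked purely for illustration, and `average_query_latency_ms` is a made-up helper.

```python
def average_query_latency_ms(read_fraction, local_ms, remote_ms):
    """Average per-query latency when reads are local and writes go to a remote master."""
    return read_fraction * local_ms + (1.0 - read_fraction) * remote_ms


# Assumed round-trip times, not measurements.
LOCAL_MS = 1.0
REMOTE_MS = 150.0

split = average_query_latency_ms(0.9, LOCAL_MS, REMOTE_MS)
all_remote = average_query_latency_ms(0.0, LOCAL_MS, REMOTE_MS)
print("reads local, writes remote: %.1f ms per query" % split)
print("everything remote:          %.1f ms per query" % all_remote)
```

Under those assumptions the split setup averages roughly 15.9 ms per query against 150 ms when every query crosses the ocean, so nearly all of the gain comes from keeping the selects local.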
But in this scenario, what would happen if the master fails? Let's just say it got cut off. Those users hitting the master server would also get rerouted to the slaves. These slaves won't be able to update anymore (as the master is now down) but would they still run, or at least be browsable? I was wondering what happens in this scenario. Will the forum just throw a regular db error if a user tries to write, while the forum stays alive and viewable?
Also how would sessions work in this case? I didn't think about it. I'm assuming those that were hitting the master server and just got rerouted to the slaves now have to log in to the forum again. But would they be able to log in, as logging in itself needs a write to the database (i.e. the last logged in field)? Would that mean they can still browse the site but not log in?
|
Posted by kmonchamp, 04-27-2011, 06:22 AM |
Mysql proxy is not exactly perfect at splitting reads and writes at this time (it works fairly well, just some things still end up going to the current master), so whether it would still work depends highly on how the software is written. What I would recommend is setting up mysql in circular replication (multi-master with more than two servers), so that the remaining two servers could still keep each other in sync. Or whatever script you use to watch for failure could connect to the mysql databases and change each server's master. (http://mysql-mmm.org/ should point you in the right direction.)
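The "script that watches for failure" part could start from something like this Python sketch. It only works out who should follow whom after a failure; `plan_failover` and the server names are hypothetical, picking the first survivor by name is a deliberately naive rule, and this is not how mysql-mmm itself decides.

```python
def plan_failover(master, slaves, alive):
    """Return (current_master, {slave: server_it_should_replicate_from}).

    master and slaves are server names; alive is the set of servers that
    passed a health check. All of these are hypothetical inputs.
    """
    if master in alive:
        # Nothing to do: every healthy slave keeps following the master.
        return master, {s: master for s in slaves if s in alive}

    candidates = sorted(s for s in slaves if s in alive)
    if not candidates:
        raise RuntimeError("no database servers left to promote")

    # Naive choice: promote the first survivor by name. A real script
    # would prefer the slave that has received the most replicated data,
    # then run CHANGE MASTER TO on the others.
    new_master = candidates[0]
    return new_master, {s: new_master for s in candidates[1:]}


print(plan_failover("us", ["eu", "asia"], alive={"us", "eu", "asia"}))
print(plan_failover("us", ["eu", "asia"], alive={"eu", "asia"}))
```

In the second call the US master is gone, so the plan promotes one of the remaining servers and points the other one at it, which is what lets the surviving servers keep each other in sync.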
Realistically I would imagine that you should have no problem with session information. It gets recorded in the database and will be replicated to all of the servers. If users end up having to connect to another server, their same session info should work.
|
Posted by quad3datwork, 04-28-2011, 12:17 AM |
I was messing with a circular multi-master setup w/ federated tables a while back. Things just never worked during a failover.
Thanks for pointing mysql-mmm out. Have you deployed it in production and had good success with it?
|
Posted by Darvil, 04-28-2011, 02:20 AM |
Looks like there is no easy way to do what I want to do.
Now I'm assuming modifying the forum script would still work? Change the script so all write processes go to the master instead of the slave. Do you think this will work? Of course it does require some coding work.
|
Posted by kmonchamp, 04-28-2011, 02:27 AM |
I haven't really had enough use out of them yet to give you a good answer to that. From what I have looked into, mysql-mmm does handle things rather well.
I would also take a look at the Percona (http://www.percona.com/) version of mysql. It has extra performance improvements and is 100% compatible with mysql. It can help boost performance.
Yes, modifying the software will make things work much more smoothly. Depending on the forum it can get rather complicated, but I am sure it is doable.
|
Posted by Darvil, 04-28-2011, 02:46 AM |
kmonchamp,
How about for low traffic forums (around 250 users online at one time) with mostly reads? Do you think it might be viable to do master to master using the better replication tools you've mentioned above? This would be in different datacenters. The only thing I worry about is if they lose their network connection to each other.
|
Posted by kmonchamp, 04-28-2011, 03:23 AM |
I think that it can be done. MMM would watch the cluster and move the writable IPs around during a failover. You would need to use a VPN for that to work properly, but you would want that anyway to make sure your mysql replication traffic is sent across the internet encrypted (either OpenVPN or L2TP/IPsec). Mysql proxy would be installed in each location and direct write requests to the virtual IP that mmm is using for writes, and send the reads to the virtual IP(s) for reads. Here is an article that I found that gives a bit more explanation (http://www.paulgraydon.co.uk/geeky/m...the-headaches/).
|
Posted by Darvil, 04-28-2011, 04:02 AM |
Thanks for your great tips. I will give it a go. This is probably much easier to do than modifying the forum script.
Also I just realized this could even be more handy for my video site. There are very few writes to the db and mostly streaming videos. Doing a similar thing with the db and rsyncing video files would work with servers from different geo regions.
|