  LongeCity
              Advocacy & Research for Unlimited Lifespans


Technical Issues


  • This topic is locked
87 replies to this topic

#31 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 24 February 2009 - 09:43 PM

This may make it easier to understand some of what I mentioned in my earlier posts. The first is the response times for the main forum page. As you can see, the response times are grouped, horizontally, around multiples of 10 seconds. You will also see some vertical tower artifacts. These aren't as easy to see on here as on the second graph, because the delays are so long that it spaces out the data points when it happens.

Attached File  20090224_main_forum_page_performance.JPG   108.49KB   18 downloads


The second is the response times for the imminst.org home page. The two main things you'll notice about this graph are the vertical towers, meaning the performance degraded around that time period, and the fact that the points aren't clustered horizontally. This is more typical of what you would see when looking at response times for a system that has varying load throughout the day. You'll also notice that the response time is better in the middle of the night (except for the two towers), when most people (in the USA) are sleeping. You'll also notice that the *vast* majority of the requests come back between half a second and 1 second. When users are hitting the system harder, we start to get some in the 1-2 second range. Of course, this isn't a full 24 hour snapshot either.

Attached File  20090224_imminst_home_page_performance.JPG   118.36KB   16 downloads


David

#32 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 25 February 2009 - 07:28 PM

Question: What are we doing with our test system license? I looked at the licensing for IP.Board and it appears we should have a production instance and a test instance. Can we install the test instance somewhere and give people access to that instance to troubleshoot/debug/enhance the software?

David

#33 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 25 February 2009 - 07:30 PM

Since this thread is about technical issues and how to solve them, I should mention that someone will need to decide how the changes will be made to the software. There are four main ways:

1) Create a "Mod" (or a complete module, for that matter) for IP.Board and install it over the top of our installation. By doing this, it can be rolled back later if it isn't compatible with a future release of IP.Board or if that future release fixes whatever bug we were fixing or includes the functionality we added. If the future version of IP.Board is compatible with the Mod or we make the Mod compatible, then we can again apply the Mod after upgrading the IP.Board version.

2) Hack directly into the software we have installed. By doing this, we may run into issues with upgrades and it will be more difficult to keep track of what has been changed over time. There won't be an easy way to roll back the changes.

3) Use a Mod that someone else has already created. I'm assuming that is one of the reasons that IP.Board was chosen, because it has a large installed base, hence there are a lot of Mods that others have created. Someone may want to review the Mods that are available to see if the functionality they provide will fit with some of the feature requests we want to implement. As mentioned in (1), these can be applied/rolled back as needed (assuming the Mod creator followed recommended practices).

4) Create pages that integrate in with the regular IP.Board pages, but use their own mechanisms to provide functionality/user interface/data. This mechanism may still be broken by an upgrade, but the code could be kept modular (if done well) such that pieces that don't work could be removed and others left in place.


I believe we'll be better off if we make an informed decision about how to make changes before we start making them. :)

Personally, I'm biased toward (1) and (3), with (4) only being used if the IPB framework can't accomplish something.

David

#34 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 25 February 2009 - 07:50 PM

I promised a director that I would post about another idea in this thread, so here goes...

Although the website (both forum and non-forum) does provide a lot of functionality/information, I think the ramp up time for prospective members and new members is high. I recommend we reach out to members who may have website design/application user interface design/"user experience"/usability backgrounds to facilitate possible changes to the site (both forum and non-forum).

I'm not speaking about functionality. Many of us can make suggestions on functionality. For example, "ability to see N most viewed posts", would be a functional request. However, how that functionality is accessed/displayed/integrated with the rest of the site benefits from the experience/vision of more of a usability minded person (or persons).

So, just like you are putting together a list of programmers who can assist with making changes to the site, I recommend a separate list (which may include some of the same people) for designers/usability professionals. I'm not suggesting that these people would be making changes at will. I'm visualizing them creating some mock-ups and then people could vote on new usability changes.

Although the developers are sometimes responsible for the usability design of applications, often times there are specialists that focus just on the usability. This is because it is often a different skill set that the developers may not have and it allows the developers to focus on enhancement requests and bug fixes, rather than the ease-of-use aspect. The usability experts would typically work with software architects. In this situation, there may not be a need to break out the architect role, since this is a comparatively simple system. I'm guessing the architect role is more of a design by committee model here, which isn't usually the best way to design things, but will probably suffice in this case.

David

#35 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 25 February 2009 - 11:51 PM

Referring back to my earlier analysis on the 10 second response time delays...

To be clear, 10 seconds or some multiple thereof is the maximum delay a forum user would experience if another user is requesting something from a user profile. If user A is in a profile and user B is clicking around in the forums, the relationship is roughly as follows, for a few scenarios. These assume that only users A and B are clicking around in the site, because other users making requests could alter the outcome.

Scenario 1:

User A clicks a tab in the user profile
Less than a second elapses
User B makes a forum request

Result: User B will most likely have to wait about 10 seconds. (0 + 10 = 10)


Scenario 2:

User A clicks on a tab in the user profile
5 seconds elapse
User B makes a forum request

Result: User B will most likely have to wait about 5 seconds. (5 + 5 = 10)

Scenario 3:

User A clicks on a tab in the user profile
8 seconds elapse
User B makes a forum request

Result: User B will most likely have to wait about 2 seconds. (8 + 2 = 10)


There *are* less common instances where user B doesn't wait at all, but most of the time he's/she's going to get stuck behind the 10 second request from user A.

Also, there are some user profiles that sometimes take 20 seconds or even 30 seconds to access. Usually (but not always), User B will get in on the next 10 second boundary and not have to wait for the full 20 or 30 seconds.
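
The arithmetic in the scenarios above can be sketched in a few lines of Python. This is only an illustration of the pattern described, assuming a fixed 10 second blocking request and only two users on the site; the function name and its parameters are made up for the example and have nothing to do with IP.Board itself.

```python
def expected_wait(elapsed_since_a, block_seconds=10.0):
    """Seconds User B waits when stuck behind User A's profile request.

    Assumes A's request blocks everything for block_seconds and that
    no other users are active (both are simplifications).
    """
    if elapsed_since_a < 0:
        raise ValueError("elapsed time cannot be negative")
    # Once A's request has finished, B is no longer blocked.
    return max(block_seconds - elapsed_since_a, 0.0)


# The three scenarios: B arrives 0, 5 and 8 seconds after A.
for elapsed in (0, 5, 8):
    wait = expected_wait(elapsed)
    print(f"B arrives {elapsed}s after A -> waits about {wait:.0f}s")
```

For the slower 20 or 30 second profiles, passing a larger block_seconds gives the worst case; the "next 10 second boundary" behaviour is not modelled here.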

The lack of consistency is intriguing. In the cases where User B does NOT have to wait at all, even though User A is in the middle of a request, it makes me think that there may be some sort of thread/connection sharing going on and User B is getting lucky and getting assigned to the thread/connection that is available, whereas normally they get assigned to the one that is busy (for some reason). Either that, or the request that User A has issued doesn't get into the blocking state until part way into the request, so if User B gets in quickly enough behind User A, their query will not be blocked. That would lead me back to the possibility that table locks may be playing a role in this.

I'll think about it a bit more and do some more research and report back.

David

#36 Mariusz

  • Guest
  • 164 posts
  • 0
  • Location:Hartford, CT

Posted 26 February 2009 - 12:22 AM

The lack of consistency is intriguing. In the cases where User B does NOT have to wait at all, even though User A is in the middle of a request, it makes me think that there may be some sort of thread/connection sharing going on and User B is getting lucky and getting assigned to the thread/connection that is available, whereas normally they get assigned to the one that is busy (for some reason). Either that, or the request that User A has issued doesn't get into the blocking state until part way into the request, so if User B gets in quickly enough behind User A, their query will not be blocked. That would lead me back to the possibility that table locks may be playing a role in this.

Quick way to check if locking is really an issue:


Start mysqld with --low-priority-updates. For storage engines that use only table-level locking (MyISAM, MEMORY, MERGE), this gives all statements that update (modify) a table lower priority than SELECT statements. In this case, the second SELECT statement in the preceding scenario would execute before the UPDATE statement, and would not need to wait for the first SELECT to finish.



David, you did a great job analyzing this issue, but truthfully it will all be just speculation until we have some more information about the server (bandwidth, free space, memory, processor, utilization) and statistics (Apache, SQL, MTA).

There are a lot of improvements to be made in the frontend alone, not to mention the backend, but we can't do anything, simply because we are not trustworthy according to the "Directors". :D

Mariusz

#37 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 01:49 AM

Quick way to check if locking is really an issue:


Start mysqld with --low-priority-updates. For storage engines that use only table-level locking (MyISAM, MEMORY, MERGE), this gives all statements that update (modify) a table lower priority than SELECT statements. In this case, the second SELECT statement in the preceding scenario would execute before the UPDATE statement, and would not need to wait for the first SELECT to finish.


Changing the priorities may make it act differently, but it won't fix the issue. I didn't have any updates in my scenarios, so I'm not sure where that came from?

David, you did a great job analyzing this issue, but truthfully it will all be just speculation until we have some more information about the server (bandwidth, free space, memory, processor, utilization) and statistics (Apache, SQL, MTA).


Those would all be nice pieces of information to have, but through my testing, I've eliminated those as issues. What I'm doing now is analyzing how the application is working and narrowing it down to a smaller list of possible causes so that if/when action is taken, fewer things will need to be looked at.

This may seem like *only* speculation, but it is actually how this type of thing is solved, regardless of the computer system. I often work with more complex, distributed systems where it isn't possible to have access to or time to collect all the details. Therefore, I have to run tests and then come up with hypotheses, run more tests, rule out certain things, etc. It is an applied, iterative form of the scientific method.

Initially it was a lot more speculative, since we didn't have enough data to work with. After I was able to reproduce the problem at will and collect some data (maestro949's idea), that provided at least a starting point to move from heavy speculation to a finer grained analysis.

Yes, it is much more efficient to have access to the system and have a lot more information to work with, but since we don't have that, I'm following it through as far as I can to see if a solution can be reached while we wait.

There are a lot of improvements to be made in the frontend alone, not to mention the backend, but we can't do anything, simply because we are not trustworthy according to the "Directors". :D

I'm assuming you mean functional improvements/bug fixes? Yeah, I agree that there are a lot of things that can be done, but this performance problem is affecting every single forum user, so I'm placing it as the highest priority. ;)


I've got an idea on how to possibly fix the issue. I'll PM Mind and see what can be done without the need to give out access at this juncture.

David

#38 Mariusz

  • Guest
  • 164 posts
  • 0
  • Location:Hartford, CT

Posted 26 February 2009 - 03:02 AM

Changing the priorities may make it act differently, but it won't fix the issue. I didn't have any updates in my scenarios, so I'm not sure where that came from?

Well, according to mysql docs "Table updates normally are considered to be more important than table retrievals, so they are given higher priority. This should ensure that updates to a table are not “starved” even if there is heavy SELECT activity for the table". 


I know it will not fix the issue, but it may give us a little more information about what is causing those problems.

I'm assuming you mean functional improvements/bug fixes? Yeah, I agree that there are a lot of things that can be done, but this performance problem is affecting every single forum user, so I'm placing it as the highest priority. ;)


No, I was talking about performance-related improvements. For example, add expiry headers, minify JavaScript, move CSS to an external file and set up gzip compression. This alone could decrease loading time by a few percent. Add to this a PHP cache and a few other tricks and the difference will be significant.


Moving CSS (over 20 KB of text sent needlessly with each request) to an external file would save a huge amount of bandwidth. I just don't think that it's a good idea to mess with default SQL settings without knowing exactly what is causing those problems and until you have all other issues resolved.
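
To put a rough number on the CSS point, here is a back-of-the-envelope sketch in Python. It assumes the stylesheet is about 20 KB, that inline CSS is sent with every page view, and that an external file with expiry headers is downloaded once per visitor; the traffic figures are invented for illustration and are not measurements from this site.

```python
def css_bytes_sent(page_views, visitors, css_bytes=20 * 1024):
    """Return (inline_total, external_total) bytes of CSS transferred.

    inline:   the stylesheet travels inside every page served
    external: assumed to be fetched once per visitor, then cached
    """
    inline_total = page_views * css_bytes
    external_total = visitors * css_bytes
    return inline_total, external_total


# Hypothetical day: 50,000 page views from 5,000 visitors.
inline, external = css_bytes_sent(page_views=50_000, visitors=5_000)
saved_mib = (inline - external) / (1024 * 1024)
print(f"CSS transfer avoided: about {saved_mib:.0f} MiB per day")
```

Real savings depend on how long browsers keep the file cached and on whether gzip is enabled, so treat this as an upper-bound style estimate.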

Installing a small utility like Munin (http://munin.projects.linpro.no/) should be done first, before doing anything else with the server. 

Mariusz

#39 Mariusz

  • Guest
  • 164 posts
  • 0
  • Location:Hartford, CT

Posted 26 February 2009 - 03:07 AM

This may seem like *only* speculation, but it is actually how this type of thing is solved, regardless of the computer system. I often work with more complex, distributed systems where it isn't possible to have access to or time to collect all the details.


No offense, you did a great job analyzing this issue, but I still think this is just speculation. It's like fixing a car based on the sounds it makes. Sometimes it works, sometimes it doesn't. For me, the more "scientific" way is to use log files and other utilities to find out what is causing the problem.


Mariusz

#40 niner

  • Guest
  • 16,276 posts
  • 2,000
  • Location:Philadelphia

Posted 26 February 2009 - 04:02 AM

David, I think that was an impressive piece of detective work you've done so far. I'm interested in the user profiles problem. If it explains the observation of the 10 second intervals, then it seems to be pretty important. Could we disable user profiles or the data-intensive aspects of them for a little while to observe the effect on the system? Giving users the ability to store videos in their profile is not important enough to justify wrecking the forums. On the other hand, if every single touch of the profiles is causing the delay, then something is really broken and needs to be fixed, probably at the IPB end. We should involve them in this. Not to put too fine a point on it, but it's their job...

#41 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 04:55 AM

Changing the priorities may make it act differently, but it won't fix the issue. I didn't have any updates in my scenarios, so I'm not sure where that came from?

Well, according to mysql docs "Table updates normally are considered to be more important than table retrievals, so they are given higher priority. This should ensure that updates to a table are not "starved" even if there is heavy SELECT activity for the table".


I know it will not fix the issue, but it may give us a little more information about what is causing those problems.

The problem is that this locking (assuming it is locking) is not a matter of incorrect prioritization of different types of DML. Whether it is a select or an update, once it gets in there, everything else stops behind it.

Now, having said that, if in the future, after we fix this issue, we find that we have a lot of lock contention, then prioritization might help, assuming the updates are slower than the selects.

I'm assuming you mean functional improvements/bug fixes? Yeah, I agree that there are a lot of things that can be done, but this performance problem is affecting every single forum user, so I'm placing it as the highest priority. ;)


No, I was talking about performance-related improvements. For example, add expiry headers, minify JavaScript, move CSS to an external file and set up gzip compression. This alone could decrease loading time by a few percent. Add to this a PHP cache and a few other tricks and the difference will be significant.


Moving CSS (over 20 KB of text sent needlessly with each request) to an external file would save a huge amount of bandwidth. I just don't think that it's a good idea to mess with default SQL settings without knowing exactly what is causing those problems and until you have all other issues resolved.


Those are all great ideas, Mariusz, so I hope you won't take my reply as being negative. However, I think we're going to find that the forum performance will be quite acceptable after we fix this blocking/locking issue. There is always tuning that can be done, but there is a point of diminishing returns at some point.

I think what we'll find is that we'll be okay after this issue is fixed, *until* we reach enough users where we start reaching the limits of our hardware. At that point, we can either spend money on hardware, or spend time on tuning to buy us more time on the current hardware. I'm not saying we'd have to wait until we are maxed out to do this tuning, but the time might be better spent on functional bug fixes and functional enhancements.

Installing a small utility like Munin (http://munin.projects.linpro.no/) should be done first, before doing anything else with the server.


What I'd generally recommend in a case like this is to use such a tool, along with a response time gathering tool (either something like I wrote or a general purpose tool) so that when we reach a certain threshold, then we can make a decision about whether to tune, upgrade software, or upgrade hardware. We could spend a lot of time tuning now (when it isn't needed, necessarily) only to find out that a new version of MySQL/IP.Board/Apache/etc. gives us the same efficiency gains for free.

To be clear, all the things you've recommended would be very appropriate for general performance issues where resource contention under load is the cause of performance degradation. We are dealing with a very special type of issue here. It is one that I always have to keep in the back of my mind when looking at system performance issues, but one that probably only actually happens 3 times out of 100. These types of latencies (blocking operations, timeouts, table locks, polling, limited number of threads/processes available) will often *appear* to be I/O related at first, since the outward symptoms are the same. The key to differentiating between one and the other is the distribution of the response times (either clustered like in the first graph I provided, when time between test requests is set relatively small compared to the response time delays, OR evenly distributed, when time between test requests is set larger than the response time delays (I didn't do that test yet, because the first was conclusive)). And, of course, if you have access to the I/O subsystem stats and they don't show a load, then that will also point you to these artificial latency issues.
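
The clustering test described above can be approximated with a short Python sketch. It assumes you already have a list of response times in seconds; the 10 second period and the one second tolerance come from the observations in this thread, while the function name, the cut-off for what counts as "slow", and the sample numbers are all invented for illustration.

```python
def fraction_near_multiples(times_s, period_s=10.0, tolerance_s=1.0, min_delay_s=5.0):
    """Fraction of slow responses that fall close to a multiple of period_s.

    Only responses of at least min_delay_s are considered, so the many
    fast sub-second hits do not drown out the pattern.
    """
    slow = [t for t in times_s if t >= min_delay_s]
    if not slow:
        return 0.0

    def near_multiple(t):
        remainder = t % period_s
        # Distance to the nearest multiple, whether just above or just below it.
        return min(remainder, period_s - remainder) <= tolerance_s

    return sum(near_multiple(t) for t in slow) / len(slow)


# Made-up samples: one set bunched near 10/20/30 s, one spread out.
clustered = [0.6, 0.7, 10.4, 9.8, 20.3, 0.5, 30.1, 10.9]
spread = [0.6, 6.2, 13.5, 7.4, 16.1, 0.5, 24.8, 12.7]
print(fraction_near_multiples(clustered))
print(fraction_near_multiples(spread))
```

A value close to 1 points toward a fixed, artificial latency (a lock, timeout or polling interval); a low value is more consistent with ordinary load-related slowness.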

David

Edited by davidd, 26 February 2009 - 03:18 PM.


#42 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 04:58 AM

This may seem like *only* speculation, but it is actually how this type of thing is solved, regardless of the computer system. I often work with more complex, distributed systems where it isn't possible to have access to or time to collect all the details.


No offense, you did a great job analyzing this issue, but I still think this is just speculation. It's like fixing a car based on the sounds it makes. Sometimes it works, sometimes it doesn't. For me, the more "scientific" way is to use log files and other utilities to find out what is causing the problem.


I guess we'll just have to wait and see how speculative it was. ;)

David

#43 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 05:11 AM

David, I think that was an impressive piece of detective work you've done so far. I'm interested in the user profiles problem. If it explains the observation of the 10 second intervals, then it seems to be pretty important. Could we disable user profiles or the data-intensive aspects of them for a little while to observe the effect on the system? Giving users the ability to store videos in their profile is not important enough to justify wrecking the forums. On the other hand, if every single touch of the profiles is causing the delay, then something is really broken and needs to be fixed, probably at the IPB end. We should involve them in this. Not to put too fine a point on it, but it's their job...


Heh, I figured someone might offer up disabling the functionality as a short term solution. ;)

Unfortunately, the problem happens whether or not people have content in their profile. You can test this with my profile, which has no content. When you click on my username to go to my profile, you will experience a 10, 20 or 30 second delay. Most likely it will be 10 or 20 seconds. After you are there, you can click on any of the tabs in my profile, none of which have content, and they will all cause a delay of a roughly fixed multiple of 10 seconds, give or take a second.

We can certainly involve IPB. As I mentioned before, I don't have the license key, nor do I have access to the system to collect the type of information they would request. I'm not sure if our support contract covers them logging into our system with low level privileges and diagnosing/fixing issues that may be specific to our setup.

One other recommendation I would make, going forward, is to have a system info page on the board, stating what software we are using, what versions of the software, what hardware and operating systems, etc., etc. This will help in making decisions in the future as well as aiding in troubleshooting. It can be as simple as a members only thread, I suppose.

David

#44 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 08:02 PM

Just so I don't forget in the future, we should take a serious look at the Sphinx search engine. It has even been made into a Mod for IPB, so it should be reasonably plug-n-play.

Why? It dramatically speeds up searches and lowers load on the server in the process. It cuts down on table locks in MySQL. It allows for the possibility of converting *some* MySQL tables to use the InnoDB storage engine. This engine's main distinguishing feature (in our case) is that it does row level locking instead of table level locking, thus making a huge difference in concurrent access to tables. The downside is that InnoDB doesn't allow for indexed fulltext searches (they've been saying "soon" since 2005). The remedy? 3rd party search engines, with Sphinx being arguably the fastest (free) one out there. MySQL even hired the Sphinx developer (although he still retains the rights to the Sphinx code).

The forums here are all about sharing and finding information. The post data is only going to grow, so efficient searching will become more important as time goes on.

David

#45 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 26 February 2009 - 10:12 PM

Here are the same types of graphs I provided earlier, but these are for the entire 24 hour period of 02/24/2009. The previous versions were from the night of 02/23/2009 and morning of 02/24/2009. The times are in CST. I reduced the font size on the diamond used for plotting the points and I also removed points under 600 ms for the forum graph and under 700 ms for the home page graph (because Excel can only plot 32000 points and I was over that limit).

The forum page graph shows the same horizontal grouping at 10 second multiples as before. The home page graph gives a good idea of our performance/load over a full day period.

Remember, these aren't graphing regular user hits on the site, they are graphing a test tool's hits on the site. By seeing how long it takes for that tool, we have an indirect measurement of how long it took for real users at the same time, accessing the same page and we can generally assume that slower response times were due to user activity, although they are sometimes caused by batch processing in the data center as well as network congestion (although that wouldn't normally cause multiple second delays).
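
For anyone who wants to collect similar numbers, a response time sampler along these lines can be sketched in Python. This is not the tool used for the graphs in this thread, just a minimal stand-in; the function names are made up, and the fetch and sleep parameters exist only so the loop can be exercised without touching the network.

```python
import time
import urllib.request


def fetch_page(url):
    """Download the page body and discard it; only the timing matters."""
    with urllib.request.urlopen(url, timeout=120) as response:
        response.read()


def sample_response_times(url, count, pause_s, fetch=fetch_page, sleep=time.sleep):
    """Request url `count` times; return (wall_clock_time, seconds_taken) pairs."""
    samples = []
    for _ in range(count):
        started_at = time.time()
        start = time.perf_counter()
        try:
            fetch(url)
        except OSError:
            # Network failures (URLError is an OSError) are skipped, not timed.
            sleep(pause_s)
            continue
        samples.append((started_at, time.perf_counter() - start))
        sleep(pause_s)
    return samples
```

Calling something like sample_response_times("http://example.org/", count=10, pause_s=2) returns pairs that can be pasted into a spreadsheet and plotted as a scatter graph. Keeping pause_s small compared to the delays being hunted is what makes any clustering visible.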

Attached File  20090224_full_day_main_forum_page_performance.JPG   98.95KB   16 downloads

Attached File  20090224_full_day_imminst_home_page_performance.JPG   107.42KB   19 downloads

David

#46 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 27 February 2009 - 08:51 PM

More general info for people to have if/when tuning takes place...

I ran the performance test against a handful of other websites that run IPB. This was against their main forum page, just like I did with ours. Our response time is almost always at 500 ms or above. A couple of the other sites had response times slightly lower than ours (maybe 300 ms and above), but a couple had response times that were ultra low after the first hit. These ultra low response times were in the under 50 ms range (mostly registering as 0 ms, but the timer is only accurate to 10 ms). And no, the page isn't going to load that fast for the end user, because it has to be rendered in the browser and that can take a second or longer. But by returning the information for the page ultra fast, their servers are able to handle more requests and reduce delays for all users.

The reason I ran the test was because I wanted to see if our fast times were still being hampered by some sort of built in, fixed latency. Even though our critical issue is the 10, 20, 30 second delays, if each hit doesn't *have* to take over 500 ms, then that would be a good thing to fix to provide a snappier response to the requests, *if* it was just a case of programmed latency and not one of slow hardware/network. After all, a 1 second time for a page to load is better than 1.5 to 2 seconds.

Now, my above test doesn't mean that we have a fixed latency issue (other than the 10 second latency issue), but there is a chance we do. The other possible cause in disparity between our site and others is that they may have some advanced caching implemented, whereas our site might be querying the database every time for the main forum page. IPB supports a number of caching packages and it is pretty easy to configure it to use them (just a line or two required in the config files). This type of caching will reduce load on the server and greatly improve response times vs. the default caching that is built into IPB.

These caching packages that are supported are: Eaccelerator, XCache, APC and MemCache. The advantage of the first 3 is that they can also be used to cache PHP code (done outside of IPB, since IPB doesn't need to change anything to support PHP caching). If we are currently using the built in caching, then any of these options will help. I'm sure there will be some minor differences in performance between them, but they should be a lot better than the built in caching.

As I mentioned, some of those packages can also be used to cache PHP code (the language IPB is programmed in). However, depending on whether Apache or IIS is being used and depending on your OS, there are various issues with using these for PHP caching that can cause instability. I'm not completely up-to-date on the issue though, so maybe it has been fixed.

I should mention that I found a tool that gives information on what we are using for our site:

http://toolbar.netcr...www.imminst.org

(look at the bottom of the page)

The results show that we are using Linux, with:

Apache/1.3.36
Unix mod_auth_passthrough/1.8
mod_log_bytes/1.2
mod_bwlimited/1.4
PHP/4.3.11
FrontPage/5.0.2.2635.SR1.2
mod_ssl/2.8.27
OpenSSL/0.9.7f

Unless the stability issues have been fixed with PHP optimizers, we'd probably have to leave that alone. But we can still use those same optimizers for database caching, which again is a built in feature of IPB and only requires a simple config file change.

Lastly, output that is sent from the web server to the browser can be cached. This is usually accomplished via a caching proxy server. This just sits between the browser and the web server and keeps a cache of common pages/images/etc. Apache has a built in module (oddly enough, named mod_proxy) that can be enabled, or third party proxy servers can be used.



Why all the talk of caching? Because memory access is much, much faster than disk access (and disk access faster than DB access). And because accessing something that is pre-computed is faster than re-computing something.
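
The "pre-computed is faster than re-computed" idea can be shown with a toy in-memory cache in Python. This is a sketch of the general technique only, not how IPB or any of the caching packages mentioned here work internally; the class, the 30 second lifetime and the render_forum_index stand-in are all hypothetical.

```python
import time


class TTLCache:
    """Keep computed values in memory for ttl_s seconds."""

    def __init__(self, ttl_s, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: nothing is recomputed
        value = compute()  # cache miss: do the expensive work once
        self._store[key] = (now + self.ttl_s, value)
        return value


db_hits = 0


def render_forum_index():
    """Hypothetical stand-in for the queries behind the main forum page."""
    global db_hits
    db_hits += 1
    return "<html>forum index</html>"


cache = TTLCache(ttl_s=30)
for _ in range(100):
    page = cache.get("forum_index", render_forum_index)
print(f"100 page requests, {db_hits} trip(s) to the database")
```

The trade-off is staleness: with a 30 second lifetime, a new post may take up to 30 seconds to appear on the cached page, which is why the lifetime and memory limits need to be chosen with the server's resources in mind.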

Some of the caching I mentioned may require an evaluation of how much available memory we have on the server(s) and the caching packages may need to be configured properly to not use too much or to use enough.

Increasing performance by using add-ons is generally a lot easier than hacking code. :) Many thousands of websites use these packages every day to improve performance and reduce hardware requirements.

I'm not suggesting that any of these caching tools will fix our 10 second delay issue. Although it is possible they could reduce the frequency of it, depending on what is going on under the covers. I'm just bringing it up as a general, relatively easy thing to do to improve the imminst.org website performance (forums and non-forums).


David

#47 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 02 March 2009 - 05:41 PM

Another recommendation...

In the spirit of good application development, it would be advantageous to keep track of all changes made to the system. It would be good to post these in some forum so that they won't be lost over time and so there is an easy way to go back and see what things were changed and when. This includes code changes, hardware changes, ISP changes, configuration changes, etc.

This will make it much easier for future system admins/developers to support the site. An offline copy would be prudent as well, in case something corrupts the online copy. Although, I'm assuming we probably have a pretty good backup routine in place to minimize data loss.

David

#48 sentinel

  • Guest, F@H
  • 794 posts
  • 11
  • Location:London (ish)

Posted 02 March 2009 - 06:21 PM

David

Although much of what you say goes over my head I would just like to say I'm impressed by the amount of time and expertise you are dedicating to this. Good work ;)

#49 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 02 March 2009 - 07:32 PM

David

Although much of what you say goes over my head I would just like to say I'm impressed by the amount of time and expertise you are dedicating to this. Good work ;)


Thanks sentinel. I wish I was in a position to provide insight into medical studies and how the human body works. But, since information technology is where my day job is, I figured I might as well help out in that area. I don't have enough time in my life to dedicate much time to artificial intelligence (although if I won the lottery, that's what I'd do). Giving guidance on the imminst.org site issues is something I can do here and there, in discrete chunks, so I'm able to fit it into my life without too much impact.

Feel free to ask at any time if you would like clarification on anything I've posted. I'd be happy to explain.

David

#50 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 02 March 2009 - 07:54 PM

Today marks the 1 week anniversary of when the source of the main forum performance issue was identified (accessing user profiles causes delays in other forum page loads). The problem still exists today.

David

P.S. I'll post one of these every week just as a respectful reminder of the fact that we know what causes the issue, but it hasn't been fixed yet. This is just a friendly nudge to encourage people to fix the issue.

Edited by davidd, 02 March 2009 - 08:02 PM.


#51 Mind

  • Topic Starter
  • Life Member, Director, Moderator, Treasurer
  • 19,054 posts
  • 2,002
  • Location:Wausau, WI

Posted 02 March 2009 - 09:10 PM

Thanks for the reminder David! Keep it up.

#52 niner

  • Guest
  • 16,276 posts
  • 2,000
  • Location:Philadelphia

Posted 02 March 2009 - 09:51 PM

Has IPB been contacted about this yet? Now that we have a good idea what the problem is, they might well have a simple solution involving anything from a config change to disabling unneeded functionality.

#53 Mind

  • Topic Starter
  • Life Member, Director, Moderator, Treasurer
  • 19,054 posts
  • 2,002
  • Location:Wausau, WI

Posted 02 March 2009 - 09:53 PM

I will start a support ticket today and see what they have to say.

#54 caliban

  • Admin, Advisor, Director
  • 9,152 posts
  • 587
  • Location:UK

Posted 02 March 2009 - 11:53 PM

Today marks the 1 week anniversary of when the source of the main forum performance issue was identified (accessing user profiles causes delays in other forum page loads). The problem still exists today.


Dear davidd

thanks for all your work in trying to identify why the forums are not as fast as they should be. Assuming that the problem is as you suspect, it is endemic to all IP boards.

Helpfully, you suggest the following solutions.

1) Create a "Mod" (or a complete module, for that matter) for IP.Board and install it over the top of our installation [...]
2) Hack directly into the software we have installed [...]


We do not have the technical expertise to accomplish this safely.

3) Use a Mod that someone else has already created. I'm assuming that is one of the reasons that IP.Board was chosen, because it has a large installed base, hence there are a lot of Mods that others have created. Someone may want to review the Mods that are available to see if the functionality they provide will fit with some of the feature requests we want to implement. As mentioned in (1), these can be applied/rolled back as needed (assuming the Mod creator followed recommended practices).



Your assumptions on this point are not correct. The software was chosen for its functionality back at the inception of Imminst. Its distinguishing feature is that -because it is a proprietary package- there is not a large 'modding' community as with other forum packages. I have had a brief look, but could not find a publicly available modification that promises to resolve the issue you identify. If anyone finds one and can vouch for its safety, all of ImmInst would be very grateful.

4) Create pages that integrate in with the regular IP.Board pages, but use their own mechanisms to provide functionality/user interface/data. This mechanism may still be broken by an upgrade, but the code could be kept modular (if done well) such that pieces that don't work could be removed and others left in place.

I am unsure what you are suggesting here, but it may come back to the answer provided to 2) and 3).

As you know, we are actively looking for someone who can be employed to upgrade various aspects of the software. Until such a person emerges and presents us with a solution, the issue of our forums reacting slowly will not be resolved. If the issue is common to all IP boards as you suggest, that person may even be able to strike a good deal with the owners of the software.

#55 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 03 March 2009 - 01:26 AM

Today marks the 1 week anniversary of when the source of the main forum performance issue was identified (accessing user profiles causes delays in other forum page loads). The problem still exists today.


Dear davidd

thanks for all your work in trying to identify why the forums are not as fast as they should be. Assuming that the problem is as you suspect, it is endemic to all IP boards.

Fortunately, that is not true. I have tested a handful of other boards and they do not have this issue.

Helpfully, you suggest the following solutions.

1) Create a "Mod" (or a complete module, for that matter) for IP.Board and install it over the top of our installation [...]
2) Hack directly into the software we have installed [...]


We do not have the technical expertise to accomplish this safely.


Do you mean (1) or (2)? If you mean doing it in the production instance, I would agree that is not a safe option, even if the IPB developers were doing it. Here is something I thought I posted in this thread, but it turns out it was in another thread. I've copied the relevant pieces here.

...
Until we can get a sandbox environment set up, I don't think we'll make much progress on enhancements/bug fixes/performance tuning. One person just isn't enough, no matter how good they are at what they do. There appears to be a bit of political history on this board (I say that as a newcomer and just witnessing various exchanges in posts). To be blunt (and not making any value judgment, but just calling it as I see it), I get the feeling that some people are worried about those who have access to the system reading private messages or accessing leadership-only data and sharing that information with others. To a lot of people, that would seem like a silly obstacle, but I can see where people might get very close to something (like ImmInst) and get caught up in personal differences of opinion, etc. The somewhat anonymous nature of the Internet can often bring out the worst in people.

By having a sandbox environment (that's a non-production environment used for testing), it would be possible to develop a mechanism to restore the production data to this environment, run some SQL to remove sensitive data (but leave the rest there, because real data is often needed for testing and really needed for performance tuning) and then let the developers have access to work on the enhancements. This is the general mechanism that thousands of companies use with their corporate systems. They have truly valid concerns about the data in production systems falling into the wrong hands, because it often contains customer data.

Once a piece of development is done, it can be tested in the sandbox environment and then put into the production environment if it is working well. If it is found that it has some adverse effect on the production environment after being migrated, it can be taken back out and worked on more or (if it isn't causing a huge issue) can be worked on in the sandbox to fix the issue and pushed to production again.

Invision Power Board (and Gallery and Blog, etc.), the software we use, does come with a production license and at least one test license. We may even be able to negotiate for more test licenses, which would make it even easier for multiple developers to work in their own sandboxes and improve the efficiency of development efforts.

This test environment could be set up on the same hardware as the production environment. This may still worry those who worry, because the developers would still have access to the machines (operating system access), even if they didn't have a login to the production database instance. Alternatively, another system could be put together and the test instance(s) could be installed on that system. It would not need to be as powerful as the production system (meaning cheaper). It could even be run from one of the developer's computers, provided they have the required network setup to allow access. Or maybe our hosting company would even allow doing this on some other machine in their arsenal.
...
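
To illustrate the "restore, then run some SQL to remove sensitive data" step from the excerpt above, here is a rough sketch in Python using an in-memory SQLite database. Everything about the schema is an assumption made for the example: the table names (`members`, `private_messages`, `posts`) and columns are hypothetical and are not the real IP.Board schema, and the real site would run on its own database server rather than SQLite.

```python
import sqlite3


def scrub_sensitive_data(conn):
    """Remove or mask private data in a sandbox copy of the database."""
    with conn:  # commit on success, roll back on error
        # Private messages are not needed for testing, so drop them all.
        conn.execute("DELETE FROM private_messages")
        # Keep the member rows (real volume matters for tuning), mask the rest.
        conn.execute(
            "UPDATE members SET email = 'user' || id || '@example.invalid', "
            "password_hash = ''"
        )


# Build a tiny stand-in for a restored production copy (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, email TEXT, password_hash TEXT);
    CREATE TABLE private_messages (id INTEGER PRIMARY KEY, sender_id INTEGER, body TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT);
    INSERT INTO members VALUES (1, 'alice', 'alice@example.com', 'abc123');
    INSERT INTO private_messages VALUES (1, 1, 'secret');
    INSERT INTO posts VALUES (1, 1, 'public post');
""")
scrub_sensitive_data(conn)
```

The point of the design is that public data (posts, topics, member counts) stays intact so performance tests remain realistic, while anything private is removed before developers are given access to the sandbox.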

3) Use a Mod that someone else has already created. I'm assuming that is one of the reasons that IP.Board was chosen, because it has a large installed base, hence there are a lot of Mods that others have created. Someone may want to review the Mods that are available to see if the functionality they provide will fit with some of the feature requests we want to implement. As mentioned in (1), these can be applied/rolled back as needed (assuming the Mod creator followed recommended practices).



Your assumptions on this point are not correct. The software was chosen for its functionality back at the inception of Imminst. Its distinguishing feature is that -because it is a proprietary package- there is not a large 'modding' community as with other forum packages. I have had a brief look, but could not find a publicly available modification that promises to resolve the issue you identify. If anyone finds one and can vouch for its safety, all of ImmInst would be very grateful.

Are you saying that the software was chosen because someone thought there would be security in not having add-ons available? If so, I would disagree with that logic. Software is almost never chosen for this reason.

I was not suggesting that there would be a fix to the performance issue available as a mod. I've made several general recommendations in this thread about how development could be done, whether that is for fixing bugs, making functional enhancements, or performance tuning. The piece you are quoting here was one where I was offering advice on options for how to safely make changes to the website software. It was not specific to the user profile issue.

As for Mods available for IPB, there are many out there, despite the fact that it is pay software. It used to be free software. I'm not sure if that had any effect on the number of Mods, but there are many available. There may be functionality that the Imminst members/users ask for that could be satisfied by one of these Mods, which would be easier than re-inventing the wheel and coding our own enhancement.

4) Create pages that integrate in with the regular IP.Board pages, but use their own mechanisms to provide functionality/user interface/data. This mechanism may still be broken by an upgrade, but the code could be kept modular (if done well) such that pieces that don't work could be removed and others left in place.

I am unsure what you are suggesting here, but it may come back to the answer provided to 2) and 3).


I'll give you an example. Let's say we want to add functionality that shows which topics have had the most posts/minute for a given forum (this is a contrived example..I'm not sure if this functionality would be in high demand). With option 4, this could be satisfied by creating a whole new page that makes use of some custom back-end programming to directly query the database to acquire the data. This would be outside the IPB framework. It wouldn't involve any changes to the IPB software. It would just be making use of the data stored in the IPB database.
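
A rough sketch of what that back-end query could look like, written in Python against an in-memory SQLite database. The schema here (`topics`, `posts`, a `posted_at` column holding Unix timestamps) is assumed for the example and is not the actual IPB schema. A real page would query the live database read-only and render the result as HTML.

```python
import sqlite3


def busiest_topics(conn, forum_id, limit=5):
    """Return (title, posts_per_minute) for one forum, busiest first."""
    rows = conn.execute(
        """
        SELECT t.title, COUNT(p.id), MIN(p.posted_at), MAX(p.posted_at)
        FROM topics t
        JOIN posts p ON p.topic_id = t.id
        WHERE t.forum_id = ?
        GROUP BY t.id, t.title
        """,
        (forum_id,),
    ).fetchall()
    ranked = []
    for title, count, first, last in rows:
        # Treat very short threads as lasting one minute to avoid dividing by ~0.
        minutes = max((last - first) / 60.0, 1.0)
        ranked.append((title, count / minutes))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical sample data; posted_at is in seconds.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (id INTEGER PRIMARY KEY, forum_id INTEGER, title TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, posted_at INTEGER);
    INSERT INTO topics VALUES (1, 10, 'Fast thread'), (2, 10, 'Slow thread'), (3, 20, 'Other forum');
    INSERT INTO posts (topic_id, posted_at) VALUES
        (1, 0), (1, 60), (1, 120),
        (2, 0), (2, 600),
        (3, 0), (3, 30);
""")
result = busiest_topics(conn, forum_id=10)
```

Because this only reads from the database, it does not touch any IPB code, which is what makes option 4 relatively safe. The main risk is that an IPB upgrade changes the tables the query depends on.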

As I stated in my post, my preference is option (1), where we create our own Mod. I'd put option (4) next in line, unless there was an existing Mod for the functionality we needed, then that would come next. Lastly would be directly making changes to the IPB code without creating a Mod for it, for the reasons I outlined above.

As you know, we are actively looking for someone who can be employed to upgrade various aspects of the software. Until such a person emerges and presents us with a solution, the issue of our forums reacting slowly will not be resolved. If the issue is common to all IP boards as you suggest, that person may even be able to strike a good deal with the owners of the software.

I did not suggest that the problem is common to all IPB installations. It is not. We have some issues that are common to *some* installations and one issue (the 10, 20, 30 second user profile delays) that I have not seen on any of the boards I tested (which doesn't mean that it isn't happening to another board, but I did not see it).



If the plan is to find one person to do all the work, I would respectfully offer up that is a plan that would slow progress. That plan would never be employed in a corporate Information Technology (IT) group at a company. Generally, more people means faster turnaround and more options and more insurance in case people leave (which does happen over time). We are fortunate that there are a number of people with experience who are willing to offer their time for free. I believe it would be a mistake to not take people up on this offer.

The suggestions I am making are based on many years of experience in IT. I got the impression that such experience was lacking, so I'm offering it up so that others may see how things are done and adopt best practices rather than reinventing the wheel and possibly heading down a bad path.

I'm in a rush here, so I don't have a lot of time to read over what I just wrote. Nothing I wrote here was written with the intent to create bad feelings. My intention is to show how it is possible to fix our current, biggest issue as well as make other changes going forward, so that we may improve the board. This current performance issue, I am positive, has already lost some potential users/members. Many studies have been done showing how fickle people are when it comes to slow response times on websites. I think we can all agree that more members/users is a good thing and not something we want to prevent.

The current issue could have been fixed a week or two ago. I'm not saying this to be confrontational. I'm saying it to point out that the current method of how things are done is not working. This generally points toward a need to change how things are done.

David

Edited by davidd, 03 March 2009 - 01:26 AM.


#56 davidd

  • Guest, F@H
  • 328 posts
  • 1
  • Location:Minnesota

Posted 03 March 2009 - 01:27 AM

I will start a support ticket today and see what they have to say.



Good plan!

David

#57 caliban

  • Admin, Advisor, Director
  • 9,152 posts
  • 587
  • Location:UK

Posted 03 March 2009 - 06:45 PM

Dear David, your offer to help is greatly appreciated. We can certainly look at setting up a "sandbox" -- we can put a second installation of the board on the server for testing purposes. Would it be sufficient if we installed a blank copy of the forum software on a subdomain on the webspace and gave you and any other developers FTP access to that installation? Presumably, you would also need database access; that may be a bit more difficult, but could also be arranged.

The only issue is that you'd have to work with a blank slate - we cannot "run some SQL to remove sensitive data". If that solution does not work, we'd need proper contracts with those who will have access to the data. Again, as the board is keen to resolve technical issues, we could spend some of ImmInst's sparse treasure to make sure that people who help out on this are not left out of pocket.

(Personally, I'd probably have a more radical preference: if someone could suggest a way of migrating away from IPboard without losing any data or functionality, this may help with better CMS and subscription integration.)

#58 Mariusz

  • Guest
  • 164 posts
  • 0
  • Location:Hartford, CT

Posted 03 March 2009 - 07:20 PM

As you know, we are actively looking for someone who can be employed to upgrade various aspects of the software. Until such a person emerges and presents us with a solution, the issue of our forums reacting slowly will not be resolved. If the issue is common to all IP boards as you suggest, that person may even be able to strike a good deal with the owners of the software.


Is this some kind of a joke?
At least 5 people offered their expertise (including me) for free! So far not one person was "employed" to do this important job.
This project could have been fixed a few months ago, yet we are still waiting for a decision from the almighty board of directors.

Mariusz

#59 Mariusz

  • Guest
  • 164 posts
  • 0
  • Location:Hartford, CT

Posted 03 March 2009 - 07:37 PM

The only issue is that you'd have to work with a blank slate - we cannot "run some SQL to remove sensitive data". If that solution does not work, we'd need proper contracts with those who will have access to the data.


I don't think we will be able to fix too much without access to shell, or at least ftp and sql.

Mariusz

#60 Mind

  • Topic Starter
  • Life Member, Director, Moderator, Treasurer
  • 19,054 posts
  • 2,002
  • Location:Wausau, WI

Posted 03 March 2009 - 09:28 PM

I am all for getting a team of people together: Mariusz, Davidd, Maestro949, maybe Harvey. Those are who I have in mind. I would like Lightowl to organize and lead the effort; however, he seems to have disappeared. Has anyone heard from him lately? I am sure you will appreciate having some sort of direction and having a person responsible for reporting changes, progress, and such things. If I don't hear from Lightowl by Thursday I will push to give FTP, SQL, and shell access to someone else or the whole group.



