[olug] Need assistance from apache gurus

Jon Larsen jon at jonlarsen.us
Tue Sep 20 14:36:55 CDT 2016


Dan,

The VMware setup is maintained by our colo provider.  My access is limited
to my own VMs (through the web interface only).  I can make some hardware
changes to my VMs, but that's it.  The web interface used to be different,
but it's been re-branded and the standard VMware web interface has replaced
it.

I have seen sluggish response from an unused system once or twice, where it
looks like a high load, but top shows 0.00.  I don't doubt that neighbors
are having an influence.

I've split the load balancer config between 80 and 443, so that the three
new webservers are only handling HTTPS traffic, while the three previous
ones are handling HTTP.
I populated the APC cache by browsing around to each server using
/etc/hosts entries (via a VPN connection).  I'm having the product team
keep an eye on it, and if they're happy in a day or two, I'll roll the new
ones onto port 80 as well.
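
The cache warm-up described above can be scripted instead of done by hand
in a browser. This is only a rough sketch, not what was actually run: it
assumes each backend answers plain HTTP on port 80 when contacted directly,
and the function names, IPs, and paths are hypothetical. HTTPS backends
would need TLS handling that is not shown here.

```python
import http.client


def plan_requests(backend_ips, paths):
    """Pair every backend with every path so each server's cache sees each page."""
    return [(ip, path) for ip in backend_ips for path in paths]


def warm(backend_ip, site_host, path, port=80, timeout=30):
    """Fetch one page from one backend, bypassing the load balancer.

    Connects to the backend's IP but sends the site's name in the Host
    header, which is what a temporary /etc/hosts entry achieves in a browser.
    Returns the HTTP status code.
    """
    conn = http.client.HTTPConnection(backend_ip, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": site_host})
        response = conn.getresponse()
        response.read()  # drain the body so the full page is generated
        return response.status
    finally:
        conn.close()
```

Looping over plan_requests(...) and calling warm(ip, "www.example.com", path)
for each pair would touch every page on every backend once.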

I did a comparison between the output of sysctl -a on three different web
servers.

First was the web server I plan on replacing. (cent 5)
Second was the newly built. (cent 6)
Third was another webserver in a different colo provider's VMware
environment, which has had no problems at all since it was built 18 months
ago. (cent 7)

There are some differences, but as far as I can tell, none of them is big
enough to matter much.
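
For anyone wanting to repeat that kind of comparison, here is a minimal
sketch. It assumes the output of sysctl -a from each server has been saved
to a text file; the function names and file names are made up for
illustration.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("key = value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and anything that is not key = value
        settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ.

    A key present on only one side shows up with None for the other side.
    """
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


def compare_files(path_a, path_b):
    """Read two saved `sysctl -a` dumps and print each differing setting."""
    with open(path_a) as fa, open(path_b) as fb:
        a, b = parse_sysctl(fa.read()), parse_sysctl(fb.read())
    for key, (va, vb) in diff_sysctl(a, b).items():
        print(f"{key}: {va!r} -> {vb!r}")
```

Calling compare_files("cent5.txt", "cent6.txt") lists only the keys whose
values differ, which is easier to scan than a raw diff of two long dumps.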

I hope to follow up as things progress.

Jon L.

On Tue, Sep 20, 2016 at 12:44 PM, Dan Linder <dan at linder.org> wrote:

> If it's semi-reproducible, can you set up Wireshark on each system and on
> the back-end port of the load balancer to see if the packets are suffering
> from retransmits or plainly not making it between the two ethernet ports?
>
> Are all VMs (load balancer and web servers) on the same hypervisor, or can
> they be made to be?  That should eliminate any external networking issues
> (which you'd probably be blind to at the VM level).
>
> Are you able to see into the health of the hypervisors themselves?  Is it
> possible that it is oversubscribed and your "neighbors" on the server are
> consuming all the free resources?
>
> Dan
>
> On Tue, Sep 20, 2016 at 11:25 AM, Jon Larsen <jon at jonlarsen.us> wrote:
>
> > Well, with all the tweaking to the kernel settings, including changing
> > fs.file-max to 100000, I'm no closer to fixing the seemingly random
> > timeouts.  I can scoot along at a good pace, then suddenly apache is slow
> > to respond, then it's back fast again.
> >
> > It's rather frustrating.
> >
> > Jon L.
> >
> > On Mon, Sep 19, 2016 at 8:11 PM, Jon Larsen <jon at jonlarsen.us> wrote:
> >
> > > Jay -
> > >
> > > I upped the Nofile and nproc last week to 10240 for each.
> > >
> > > I just turned off iptables, and it seems faster - it hasn't hesitated
> > > on me yet.  I've compared the two iptables config files, and they're
> > > pretty much the same, so it's possible it's something in the netfilter
> > > settings.  The centos 5 system didn't have an entry for
> > > /proc/sys/net/nf_conntrack_max but on centos 6 it was 65536.
> > >
> > > The engineers at the colo suggested I change the VMware hardware
> > > settings so the NIC was vmxnet3 instead of e1000.
> > >
> > > I went through sysctl.conf and duplicated the settings over from
> > > centos 5 shortly after you sent your reply.
> > >
> > > sysctl
> > >
> > > net.ipv4.ip_forward = 0
> > > net.ipv4.conf.default.rp_filter = 1
> > > net.ipv4.conf.default.accept_source_route = 0
> > > kernel.sysrq = 0
> > > kernel.core_uses_pid = 1
> > > net.ipv4.tcp_synack_retries = 2
> > > net.ipv4.conf.all.send_redirects = 0
> > > net.ipv4.conf.default.send_redirects = 0
> > > net.ipv4.conf.all.accept_source_route = 0
> > > net.ipv4.conf.all.accept_redirects = 0
> > > net.ipv4.conf.all.secure_redirects = 0
> > > net.ipv4.conf.all.log_martians = 1
> > > net.ipv4.conf.default.accept_source_route = 0
> > > net.ipv4.conf.default.accept_redirects = 0
> > > net.ipv4.conf.default.secure_redirects = 0
> > > net.ipv4.icmp_echo_ignore_broadcasts = 1
> > > net.ipv4.tcp_syncookies = 1
> > > net.ipv4.conf.all.rp_filter = 1
> > > net.ipv4.conf.default.rp_filter = 1
> > > net.ipv6.conf.default.router_solicitations = 0
> > > net.ipv6.conf.default.accept_ra_rtr_pref = 0
> > > net.ipv6.conf.default.accept_ra_pinfo = 0
> > > net.ipv6.conf.default.accept_ra_defrtr = 0
> > > net.ipv6.conf.default.autoconf = 0
> > > net.ipv6.conf.default.dad_transmits = 0
> > > net.ipv6.conf.default.max_addresses = 1
> > > kernel.exec-shield = 1
> > > kernel.randomize_va_space = 1
> > > fs.file-max = 65535
> > > kernel.pid_max = 65536
> > > net.ipv4.ip_local_port_range = 2000 65000
> > > kernel.msgmnb = 65536
> > > kernel.msgmax = 65536
> > > kernel.shmmax = 68719476736
> > > kernel.shmall = 4294967296
> > >
> > > This will be behind another firewall and load balancer, so I may be
> > > able to skip by without iptables, but I hate that idea.  I've always
> > > had firewalls on my systems, no matter the environment.
> > >
> > > Quick ab test (8 GB RAM, 4 VCPUs)
> > >
> > >
> > >
> > > Server Software:        Apache/2.2.15
> > > Server Hostname:        xxxxxxxxxxxxxxxxxxxxxxx
> > > Server Port:            80
> > >
> > > Document Path:          /
> > > Document Length:        56158 bytes
> > >
> > > Concurrency Level:      5
> > > Time taken for tests:   27.745 seconds
> > > Complete requests:      10
> > > Failed requests:        9
> > >    (Connect: 0, Receive: 0, Length: 9, Exceptions: 0)
> > > Total transferred:      566215 bytes
> > > HTML transferred:       562375 bytes
> > > Requests per second:    0.36 [#/sec] (mean)
> > > Time per request:       13872.730 [ms] (mean)
> > > Time per request:       2774.546 [ms] (mean, across all concurrent requests)
> > > Transfer rate:          19.93 [Kbytes/sec] received
> > >
> > > Connection Times (ms)
> > >               min  mean[+/-sd] median   max
> > > Connect:       60   63   4.3     60      69
> > > Processing:  4193 13697 9880.6  22881   23291
> > > Waiting:     3713 13189 9643.0  21051   22720
> > > Total:       4256 13759 9877.3  22940   23351
> > >
> > > Percentage of the requests served within a certain time (ms)
> > >   50%  22940
> > >   66%  23059
> > >   75%  23067
> > >   80%  23227
> > >   90%  23351
> > >   95%  23351
> > >   98%  23351
> > >   99%  23351
> > >  100%  23351 (longest request)
> > >
> > > (home page is dynamic content, 2.5 MB in size.)
> > >
> > > I'll be doing some more testing when I get back in the office in the
> > > morning.
> > >
> > > Jon L.
> > >
> > >
> > >
> > > On Mon, Sep 19, 2016 at 4:48 PM, Jay Bendon <jaybocc2 at gmail.com> wrote:
> > >
> > >> How do TCP settings compare from centos5 to centos6?
> > >>
> > >> Are you getting a lot of syn drops? (netstat -s |grep -i dropped)
> > >>
> > >> NoFiles limits reasonable?
> > >>
> > >> Sanity-check your somaxconn, tcp_max_syn_backlog, rmem, wmem, and
> > >> nf_conntrack_max (if using iptables) settings for your application
> > >> (and compare to centos5)
> > >>
> > >> Just spitballing some of the common reasons for connectivity issues
> > >> under load
> > >>
> > >> --Jay
> > >>
> > >> On Mon, Sep 19, 2016 at 2:15 PM, Jon Larsen <jon at jonlarsen.us> wrote:
> > >>
> > >> > I have a weird issue on my plate.
> > >> >
> > >> > I have three apache web servers behind an ldirector load balancer
> > >> > running centos 5.x on VMware.  I've built three new centos 6.x web
> > >> > server VMs to replace them.
> > >> >
> > >> > I used the same apache configs, as the apache versions don't change
> > >> > much between 5 and 6.
> > >> >
> > >> > I'm encountering intermittent network disconnects when I use the
> > >> > three new centos 6 systems in production, forcing me to backpedal to
> > >> > the older cent 5 systems.
> > >> >
> > >> > The disconnects appear at random, with no concurrent high CPU load.
> > >> >
> > >> > The disk scheduler is already set to deadline, and I'm using the
> > >> > suggested VMware 'vmxnet3' nic adapters.
> > >> >
> > >> > I've tried several profiles of prefork settings, but encounter the
> > >> > same issue.
> > >> >
> > >> > Currently, they are set to:
> > >> >
> > >> > StartServers 100
> > >> > MinSpareServers 30
> > >> > MaxSpareServers 40
> > >> > ServerLimit 220
> > >> > MaxClients 220
> > >> > MaxRequestsPerChild 2000
> > >> >
> > >> > Any ideas?
> > >> >
> > >> > Jon L.
> > >> > _______________________________________________
> > >> > OLUG mailing list
> > >> > OLUG at olug.org
> > >> > https://lists.olug.org/mailman/listinfo/olug
> > >> >
> > >> _______________________________________________
> > >> OLUG mailing list
> > >> OLUG at olug.org
> > >> https://lists.olug.org/mailman/listinfo/olug
> > >>
> > >
> > >
> > _______________________________________________
> > OLUG mailing list
> > OLUG at olug.org
> > https://lists.olug.org/mailman/listinfo/olug
> >
>
>
>

